* **Inputs**: outputs of all the previous workflows, plus the output of binning dereplication (linking a bin to a CRISPR requires, at minimum, contigs assigned to bins)
* **Steps**
- BLAST of bins against CRISPR flanks (from workflow **CrisprPrediction**)
- BLAST of bins against CRISPR repeats (from workflow **CrisprPrediction**)
- identification of hosts: filter bins by matches to flank and repeat sequences, then by coverage and identity
- link CRISPR spacers to protospacers (formatting and adding information to the protospacers identified in workflow **CrisprPrediction**)
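
The host-identification step above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the five-column tab-separated layout (bin, CRISPR, percent identity, alignment length, target length), the function names, the 90% identity and 0.9 coverage thresholds, and the requirement that a bin match both the flank and the repeat of the same CRISPR are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record layout; a real BLAST run would need a custom tabular
# output format that includes the length of the flank/repeat sequence.
Hit = namedtuple("Hit", "bin_id crispr_id pident aln_len target_len")


def parse_hits(lines):
    """Parse tab-separated hits: bin_id, crispr_id, pident, aln_len, target_len."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        bin_id, crispr_id, pident, aln_len, target_len = line.split("\t")[:5]
        hits.append(Hit(bin_id, crispr_id, float(pident), int(aln_len), int(target_len)))
    return hits


def passing_pairs(hits, min_identity=90.0, min_coverage=0.9):
    """Return (bin, CRISPR) pairs whose hit passes both thresholds (assumed values)."""
    pairs = set()
    for h in hits:
        coverage = h.aln_len / h.target_len  # fraction of the flank/repeat aligned
        if h.pident >= min_identity and coverage >= min_coverage:
            pairs.add((h.bin_id, h.crispr_id))
    return pairs


def candidate_hosts(flank_hits, repeat_hits, **thresholds):
    """Keep pairs supported by both a flank hit and a repeat hit (an assumption)."""
    return passing_pairs(flank_hits, **thresholds) & passing_pairs(repeat_hits, **thresholds)


flank = parse_hits(["bin1\tcrisprA\t98.0\t95\t100", "bin2\tcrisprA\t80.0\t95\t100"])
repeat = parse_hits(["bin1\tcrisprA\t100.0\t30\t30", "bin2\tcrisprA\t100.0\t30\t30"])
print(candidate_hosts(flank, repeat))
```

In this toy input, `bin2` is dropped because its flank hit falls below the assumed identity threshold, even though its repeat hit passes.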
@@ -7,7 +7,7 @@ The CRISPR-MGE pipeline identifies the CRISPRs from reads and contigs, and invas
* [iMGEs prediction](CRISPR-prediction.md)
* [iMGEs dereplication](MGE-dereplication.md): collection of predicted MGEs and redundancy removal.
* [iMGEs remapping](MGE-remapping.md): remapping of all the metagenomic and metatranscriptomic reads to the iMGE sequences.
* **MgeHostLink**: identification of candidate hosts, their spacer composition and the link with the protospacer-containing contigs.
* [iMGE-Hosts CRISPR-mediated links](MGE-host-link.md): identification of candidate hosts, their spacer composition and the link with the protospacer-containing contigs.
## Dependencies
- [CRASS](http://ctskennerton.github.io/crass/)
...
...
@@ -23,51 +23,6 @@ The CRISPR-MGE pipeline identifies the CRISPRs from reads and contigs, and invas
- R version 3.4.0: packages `tidyverse`, `ggplot2`, `reshape2`, (...)
## 2- **MgePrediction workflow**
* **Inputs**: from IMP results: MGMT co-assembled contigs and MT contigs
* **Steps and outputs**:
- prediction of **phages** by [VirSorter](https://github.com/simroux/VirSorter) and [VirFinder](https://github.com/jessieren/VirFinder)
- prediction of **plasmids** by [cBar](http://csbl.bmb.uga.edu/~ffzhou/cBar/) and [PlasFlow](https://github.com/smaegol/PlasFlow)