Commit 3b96fcbc authored by Nikola de Lange

Added exercise. Conversion of a shell script into a Snakefile

parent 73a28a45
@@ -86,3 +86,36 @@
* You can put all commands in a single rule or create one rule per command. Since we do not reuse any of the intermediate outputs, a single rule works fine, but otherwise it is better to split up the rule (e.g. if you want to create multiple plots from the bigwig or matrix file).
## Shell script into Snakefile
1. Have a look at the following shell script.
The script is adapted from the first few steps of the Galaxy ATAC-seq tutorial (https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html).
```bash
#!/bin/sh
wget https://zenodo.org/record/3270536/files/SRR891268_R1.fastq.gz
wget https://zenodo.org/record/3270536/files/SRR891268_R2.fastq.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr22.fa.gz
# Build the bowtie index of the reference for the mapping
bowtie2-build chr22.fa.gz hg38_chr22
# Mapping
bowtie2 -x hg38_chr22 -1 SRR891268_R1.fastq.gz -2 SRR891268_R2.fastq.gz \
-I 0 -X 500 --fr --dovetail --very-sensitive | \
samtools sort -n - > SRR891268.bam
# Peak Calling
Genrich -t SRR891268.bam -o SRR891268.narrowPeak \
-e "chrM" -f SRR891268.log -m 0 -j -d 100 \
-q 0.05 -a 20 -l 0 -g 100 -v \
-k SRR891268.bg
```
* Write the script into a Snakefile.
* The peak caller `Genrich` is available from the `bioconda` conda channel.
2. How can you improve the workflow in the context of data management? Think about data structures and log files.
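
As a starting point, the first steps of the script could be sketched as Snakemake rules like the ones below. This is a minimal sketch, not a full solution: it covers only the reference download and the index build, and the directory names `data/`, `index/` and `logs/` as well as the rule names are assumptions chosen for illustration. It shows one possible way to keep inputs, results and log files in separate folders.

```python
# Snakefile (sketch): covers only the reference download and the index build.
# The folders data/, index/ and logs/ and all rule names are hypothetical.

rule all:
    input:
        "index/hg38_chr22.1.bt2"

rule download_reference:
    output:
        "data/chr22.fa.gz"
    log:
        "logs/download_reference.log"
    shell:
        # wget reports progress on stderr, so stderr goes to the log file
        "wget -O {output} "
        "http://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr22.fa.gz "
        "2> {log}"

rule bowtie2_index:
    input:
        "data/chr22.fa.gz"
    output:
        # bowtie2-build writes six files that share one prefix
        multiext("index/hg38_chr22",
                 ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2",
                 ".rev.1.bt2", ".rev.2.bt2")
    params:
        prefix="index/hg38_chr22"
    log:
        "logs/bowtie2_index.log"
    shell:
        "bowtie2-build {input} {params.prefix} > {log} 2>&1"
```

The mapping and peak-calling steps can follow the same pattern: each rule names its input and output files explicitly and writes to its own file under `logs/`, so a failed step can be inspected without rerunning the whole workflow.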