Commit 04df6e65 authored by Sarah Peter

Merge branch 'exercises' of ssh://git-r3lab-server.uni.lu:8022/R3/school/snakemake-tutorial into exercises

* 'exercises' of ssh://git-r3lab-server.uni.lu:8022/R3/school/snakemake-tutorial:
  Added exercise. Conversion of a shell script into a Snakefile
parents 8bacad6c 3b96fcbc
@@ -86,3 +86,36 @@
* You can put all commands in a single rule or create one rule per command. Since none of the intermediate outputs are reused, a single rule works fine here; if they were reused, splitting the work into one rule per command would be better.
## Shell script into Snakefile
1. Have a look at the following shell script.
The script is adapted from the first few steps of the Galaxy ATAC-seq tutorial (https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html).
```bash
#!/bin/sh
wget https://zenodo.org/record/3270536/files/SRR891268_R1.fastq.gz
wget https://zenodo.org/record/3270536/files/SRR891268_R2.fastq.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr22.fa.gz
# Build the bowtie index of the reference for the mapping
bowtie2-build chr22.fa.gz hg38_chr22
# Mapping
bowtie2 -x hg38_chr22 -1 SRR891268_R1.fastq.gz -2 SRR891268_R2.fastq.gz \
    -I 0 -X 500 --fr --dovetail --very-sensitive | \
    samtools sort -n - > SRR891268.bam
# Peak Calling
Genrich -t SRR891268.bam -o SRR891268.narrowPeak \
    -e "chrM" -f SRR891268.log -m 0 -j -d 100 \
    -q 0.05 -a 20 -l 0 -g 100 -v \
    -k SRR891268.bg
```
* Convert the script into a Snakefile.
* The peak caller `Genrich` is available from the `bioconda` conda channel.
2. How can you improve the workflow in the context of data management? Think about data structures and log files.
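
As one possible direction for the second question, the sketch below separates inputs, results and log files in plain shell. The directory names and the `run_logged` helper are hypothetical choices made for illustration and are not part of the tutorial. The stand-in command only writes a message to stderr so that the sketch runs without downloading data or installing the tools.

```bash
#!/bin/sh
# Sketch only: directory names and the run_logged helper are hypothetical.
set -eu

SAMPLE="SRR891268"

# Keep downloaded inputs, the reference, results and logs apart.
mkdir -p data/reads data/reference results/mapping results/peaks logs

# Run a command and store its stderr in logs/<step>.log.
run_logged() {
    step="$1"
    shift
    "$@" 2> "logs/${step}.log"
}

# Derive every output path from the sample name instead of hard-coding it.
BAM="results/mapping/${SAMPLE}.bam"
PEAKS="results/peaks/${SAMPLE}.narrowPeak"

# Stand-in command that only writes to stderr; a real step such as
# bowtie2-build would be passed to run_logged in the same way.
run_logged example sh -c 'echo "index built" >&2'

echo "mapping output: ${BAM}"
echo "peak output:    ${PEAKS}"
```

In a Snakefile, the same idea could be expressed by including the directories in each rule's `input` and `output` paths and by giving each rule a `log:` entry.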