Commit 642aed7b authored by Sarah Peter

Starting a new page for exercises

parent d437b308
# Exercises
## Quality control
1. Add a rule that generates [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) reports for the two input fastq.gz files.
* The tool `fastqc` is available from the `bioconda` conda channel.
* An example command line for `fastqc` is:
```
$ fastqc chip-seq/H3K4-TC1-ST2-D0.12.fastq.gz
```
* The output consists of two files written to the same directory as the input file: `H3K4-TC1-ST2-D0.12_fastqc.html` and `H3K4-TC1-ST2-D0.12_fastqc.zip`.
* You might want to copy at least the HTML file to the `output` directory.
* Do not forget to add at least one of the output files to the summary rule, or to specify it explicitly on the command line.
2. Add a rule that generates alignment statistics for the two BAM files with [Picard](https://broadinstitute.github.io/picard/) [CollectAlignmentSummaryMetrics](https://broadinstitute.github.io/picard/command-line-overview.html#CollectAlignmentSummaryMetrics).
* The tool `picard` is available from the `bioconda` conda channel.
* An example command line for `picard` is:
```
$ picard CollectAlignmentSummaryMetrics R=reference/Mus_musculus.GRCm38.dna_sm.chromosome.12.fa I=bowtie2/H3K4-TC1-ST2-D0.12.bam O=output/alignment_metrics_H3K4-TC1-ST2-D0.txt
```
* You can use the `params` directive again to specify the reference file.
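
When declaring the outputs of the FastQC rule, it can help to work out the report names from the input file name first. The following is a minimal Python sketch of the naming pattern described in exercise 1. The helper name `fastqc_outputs` is hypothetical, and the sketch only handles inputs ending in `.fastq.gz`, as in the example above; it is not part of FastQC or Snakemake.

```python
from pathlib import PurePosixPath


def fastqc_outputs(fastq_path):
    """Return the (html, zip) report paths expected for a .fastq.gz input.

    Hypothetical helper: it assumes the reports land next to the input file
    and that the input name ends in ".fastq.gz", as in the exercise text.
    """
    path = PurePosixPath(fastq_path)
    suffix = ".fastq.gz"
    if not path.name.endswith(suffix):
        raise ValueError(f"expected a {suffix} file, got: {fastq_path}")
    # Strip the suffix and append the "_fastqc" report endings.
    stem = path.name[: -len(suffix)]
    html_report = path.with_name(stem + "_fastqc.html")
    zip_report = path.with_name(stem + "_fastqc.zip")
    return str(html_report), str(zip_report)


html_report, zip_report = fastqc_outputs("chip-seq/H3K4-TC1-ST2-D0.12.fastq.gz")
print(html_report)
print(zip_report)
```

The two printed paths are the files a rule could copy to the `output` directory or list as its outputs.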