Commit 73a28a45 authored by Sarah Peter's avatar Sarah Peter

Add exercise to plot heatmap with deeptools

parent c9821d9b
......@@ -55,7 +55,7 @@
* Be aware that `bedtools` writes the output to the command-line (stdout), so you have to manually redirect it to a file with `>`.
* (Optional) Since our peaks are expected to be at transcription start sites (TSS) and not necessarily within the gene body, you might want to extend the regions in the gene BED file with `bedtools slop` to cover also the TSS.
* (Optional) Since our peaks are expected to be at transcription start sites (TSS) and not necessarily within the gene body, you might want to extend the regions in the gene BED file with `bedtools slop` to cover a few kb upstream of the gene start.
```bash
$ bedtools slop -l 2000 -r 0 -s -i reference/UCSC_mm10_refseq.chromosome.12.bed -g reference/Mus_musculus.GRCm38.dna_sm.chromosome.12.fa.fai
......@@ -65,3 +65,24 @@
* (Optional) Extract the list of unique genes from the `bedtools intersect` output (column 14) using `cut`, `sort` and `uniq`.
2. Add a rule that draws a heatmap of the ChIP signal around the genes using [deepTools](https://deeptools.readthedocs.io/en/latest/index.html).
* The toolsuite `deepTools` is available from the `bioconda` conda channel.
* You will need the `computeMatrix` and `plotHeatmap` tools.
* An example command-line to generate a plot is
```bash
$ computeMatrix scale-regions -S output/TC1-ST2-D0.12_treat_pileup.bigwig -R reference/UCSC_mm10_refseq.chromosome.12.bed -b 3000 -m 5000 -a 1000 -o deeptools/TC1-ST2-D0.12_matrix.gz
$ plotHeatmap -m deeptools/TC1-ST2-D0.12_matrix.gz -o output/TC1-ST2-D0.12_deeptools_plot.png
```
* (Optional) Generate a properly normalised bigwig track first, using `bamCompare`, and use this as input for `computeMatrix`.
```bash
$ bamCompare -b1 bowtie2/H3K4-TC1-ST2-D0.12.bam -b2 bowtie2/INPUT-TC1-ST2-D0.12.bam -o deeptools/TC1-ST2-D0.12_normalised.bigwig -of bigwig --effectiveGenomeSize 120129022
```
* You can put all commands in a single rule or create one rule per command. Since we do not reuse any of the intermediate outputs, a single rule works fine, but otherwise it is better to split up the rule (e.g. if you want to create multiple plots from the bigwig or matrix file).
SAMPLE = "TC1-ST2-D0.12"
rule all:
input: f"output/{SAMPLE}_peaks.narrowPeak", f"output/{SAMPLE}_control_lambda.bigwig", f"output/{SAMPLE}_treat_pileup.bigwig", f"output/H3K4-{SAMPLE}_fastqc.html", f"output/INPUT-{SAMPLE}_fastqc.html", f"output/alignment_metrics_H3K4-{SAMPLE}.txt", f"output/alignment_metrics_INPUT-{SAMPLE}.txt", f"output/{SAMPLE}_peaks_intersect_genes.tsv"
input: f"output/{SAMPLE}_peaks.narrowPeak", f"output/{SAMPLE}_control_lambda.bigwig", f"output/{SAMPLE}_treat_pileup.bigwig", f"output/H3K4-{SAMPLE}_fastqc.html", f"output/INPUT-{SAMPLE}_fastqc.html", f"output/alignment_metrics_H3K4-{SAMPLE}.txt", f"output/alignment_metrics_INPUT-{SAMPLE}.txt", f"output/{SAMPLE}_peaks_intersect_genes.tsv", f"output/{SAMPLE}_deeptools_plot.png"
rule mapping:
input: "chip-seq/{sample}.fastq.gz"
......@@ -93,3 +93,21 @@ rule intersect:
bedtools intersect -wb -a {input.peaks} -b {input.genes} > {output.full}
cut -f 14 {output.full} | sort | uniq > {output.genes}
"""
rule plotSignal:
input:
control = "bowtie2/INPUT-{sample}.bam",
chip = "bowtie2/H3K4-{sample}.bam"
output:
bigwig = "deeptools/{sample}_normalised.bigwig",
matrix = "deeptools/{sample}_matrix.gz",
plot = "output/{sample}_deeptools_plot.png"
params:
regions = "reference/UCSC_mm10_refseq.chromosome.12.bed"
conda: "envs/deeptools.yaml"
shell:
"""
bamCompare -b1 {input.chip} -b2 {input.control} -o {output.bigwig} -of bigwig --effectiveGenomeSize 120129022
computeMatrix scale-regions -S {output.bigwig} -R {params.regions} -b 3000 -m 5000 -a 1000 -o {output.matrix}
plotHeatmap -m {output.matrix} -o {output.plot}"
"""
name: deeptools
channels:
- bioconda
dependencies:
- deeptools
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment