Commit ef5f66c8 authored by AntonieV

some minor changes on report descriptions

parent 26835831
The **MultiQC** report collects plots and statistics from the ChIP-seq processing pipeline (phantompeakqualtools),
Preseq analysis, and metrics from Picard and Samtools. It also contains quality control metrics and plots from FastQC analysis.
Detailed descriptions of the individual plots and statistics can be found in the MultiQC report itself.
**`Homer annotatePeaks <http://homer.ucsd.edu/homer/ngs/annotation.html>`_ ** is used to annotate the peaks relative
`Homer annotatePeaks <http://homer.ucsd.edu/homer/ngs/annotation.html>`_ is used to annotate the peaks relative
to known genomic features. The plots of this analysis show for all samples and their associated controls peak location
relative to annotation, percentage of unique genes to closest peak and peak distribution relative to TSS
(Transcription Start Site).
relative to annotation, percentage of unique genes to closest peak and peak distribution relative to TSS (Transcription Start Site).
**`Base distribution by cycle plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360042477312-CollectBaseDistributionByCycle-Picard->`_ (Picard)** is
`Base distribution by cycle plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360042477312-CollectBaseDistributionByCycle-Picard->`_ **(Picard)** is
used as alignment-level quality control and shows the nucleotide distribution per cycle of the bam files after
filtering, sorting, merging and removing orphans. For any cycle within reads the relative proportions of nucleotides
should reflect the AT:CG content. For all nucleotides flattish lines would be expected and any spikes would suggest a
......
**`MA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotMA>`_ (FDR 0.01)**
`MA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotMA>`_ **(FDR 0.01)**
shows the log2 fold changes versus the mean of normalized counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf>`_ analysis filtered on a false
discovery rate (FDR) threshold of 0.01 for pairwise comparisons of samples across the groups from a particular antibody.
......
**`MA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotMA>`_ (FDR 0.05)**
`MA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotMA>`_ **(FDR 0.05)**
shows the log2 fold changes versus the mean of normalized counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf>`_ analysis filtered on a false
discovery rate (FDR) threshold of 0.05 for pairwise comparisons of samples across the groups from a particular antibody.
......
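The two axes of the MA plot described above can be sketched for a single peak as follows. This is a simplified illustration, not DESeq2's estimation procedure: the function name `ma_point`, the pseudocount and the plain per-group means are assumptions made for this example, whereas DESeq2 derives its log2 fold changes from a fitted model.

```python
import math


def ma_point(norm_counts_a, norm_counts_b, pseudocount=1.0):
    """Return (mean of normalized counts, log2 fold change) for one peak.

    Hypothetical helper for illustration only. The pseudocount avoids
    division by zero and is an assumption of this sketch.
    """
    all_counts = list(norm_counts_a) + list(norm_counts_b)
    # x axis: mean of normalized counts over all samples in the comparison
    mean_all = sum(all_counts) / len(all_counts)
    mean_a = sum(norm_counts_a) / len(norm_counts_a)
    mean_b = sum(norm_counts_b) / len(norm_counts_b)
    # y axis: log2 fold change of group B relative to group A
    log2_fc = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return mean_all, log2_fc


if __name__ == "__main__":
    print(ma_point([3, 3], [7, 7]))
```

In such a plot, peaks without a change between the groups scatter around a log2 fold change of zero, and the FDR threshold only decides which points are highlighted as significant.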
**`Heatmap plot <https://cran.r-project.org/web/packages/pheatmap/pheatmap.pdf>`_** shows for each antibody a heatmap of
`Heatmap plot <https://cran.r-project.org/web/packages/pheatmap/pheatmap.pdf>`_ shows for each antibody a heatmap of
the `rlog <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.rlog>`_ or
`vst <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.vst>`_ transformed counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_ analysis. For more
information about DESeq2 please see
DESeq2 analysis. For more information about DESeq2 please see
`documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
**`PCA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotPCA>`_
(Principal Component Analysis)** describes variance of the
`PCA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotPCA>`_
**(Principal Component Analysis)** describes variance of the
`rlog <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.rlog>`_ or
`vst <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.vst>`_ transformed counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_ analysis with regard to
the antibody used. For more information about DESeq2 please see
DESeq2 analysis with regard to the antibody used. For more information about DESeq2 please see
`documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
**Correlation `heatmap plots <https://cran.r-project.org/web/packages/pheatmap/pheatmap.pdf>`_** show heatmaps of the
Correlation `heatmap plots <https://cran.r-project.org/web/packages/pheatmap/pheatmap.pdf>`_ show heatmaps of the
`rlog <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.rlog>`_ or
`vst <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.vst>`_ transformed counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_ analysis for each
pairwise comparison of samples across the groups from a particular antibody. For more information about DESeq2 please
DESeq2 analysis for each pairwise comparison of samples across the groups from a particular antibody. For more information about DESeq2 please
see `documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
**Scatter plots** use the matrix of the
`rlog <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.rlog>`_ or
`vst <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.vst>`_ transformed counts from
`DESeq2 <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_ analysis for
pairwise comparisons of samples across the groups from a particular antibody. For more information about DESeq2 please see
DESeq2 analysis for pairwise comparisons of samples across the groups from a particular antibody. For more information about DESeq2 please see
`documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
**deepTools `plotFingerprint <https://deeptools.readthedocs.io/en/latest/content/tools/plotFingerprint.html>`_** is a
**deepTools** `plotFingerprint <https://deeptools.readthedocs.io/en/latest/content/tools/plotFingerprint.html>`_ is a
quality control tool, which determines how well the signal in the ChIP-seq sample can be differentiated from the
background distribution of reads in the control sample. plotFingerprint randomly samples genome regions of a specified
length and sums the per-base coverage that overlap with those regions. These values are then sorted according to their
rank and the cumulative sum of read counts is plotted. With a perfect uniform distribution of reads along the genome and
infinite sequencing coverage a straight diagonal line is shown in the plot. For more information on the interpretation of
the plots please see `here <https://deeptools.readthedocs.io/en/latest/content/tools/plotFingerprint.html#what-the-plots-tell-you>`_.
the plots please see `documentation <https://deeptools.readthedocs.io/en/latest/content/tools/plotFingerprint.html#what-the-plots-tell-you>`_.
For more information about plotFingerprint metrics see
`here <https://deeptools.readthedocs.io/en/latest/content/feature/plotFingerprint_QC_metrics.html>`_.
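The sort-and-accumulate step described for plotFingerprint can be sketched as below. This is a minimal illustration under stated assumptions, not deepTools' implementation: the random sampling of genome regions and the reading of bam files are omitted, and `bin_coverage` is assumed to already hold the summed coverage of each sampled region. The function name `fingerprint_curve` is hypothetical.

```python
from itertools import accumulate


def fingerprint_curve(bin_coverage):
    """Return the cumulative fraction of coverage over bins ranked by coverage.

    Hypothetical helper: `bin_coverage` is assumed to contain one summed
    coverage value per sampled genome region.
    """
    ranked = sorted(bin_coverage)  # rank bins from lowest to highest coverage
    total = sum(ranked)
    if total == 0:
        raise ValueError("no coverage in the sampled bins")
    # running sum divided by the total gives the y values of the curve
    return [running / total for running in accumulate(ranked)]


if __name__ == "__main__":
    print(fingerprint_curve([5, 5, 5, 5]))   # uniform coverage
    print(fingerprint_curve([0, 0, 0, 10]))  # all signal in one bin
```

Uniform coverage yields equally spaced values, which corresponds to the straight diagonal mentioned above, while signal concentrated in a few bins keeps the curve flat until the highest ranks.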
**deepTools `plotHeatmap <https://deeptools.readthedocs.io/en/develop/content/tools/plotHeatmap.html>`_** creates a
**deepTools** `plotHeatmap <https://deeptools.readthedocs.io/en/develop/content/tools/plotHeatmap.html>`_ creates a
heatmap for scores over sets of genomic regions and gives a visualisation for the genome-wide enrichment of the samples.
The `scores <https://deeptools.readthedocs.io/en/develop/content/tools/computeMatrix.html>`_ represent the signal over a
set of regions where all regions are scaled to the same size.
**`Insert size histogram
<https://gatk.broadinstitute.org/hc/en-us/articles/360037055772-CollectInsertSizeMetrics-Picard->`_ (Picard)** is used
`Insert size histogram
<https://gatk.broadinstitute.org/hc/en-us/articles/360037055772-CollectInsertSizeMetrics-Picard->`_ **(Picard)** is used
for validating paired-end library construction. It shows the insert size distribution versus fractions of read pairs in
each of the three orientations (FR, RF, and TANDEM) as a histogram. For more information about `collected Picard metrics
<https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard->`_ please
......
**`MACS2 <https://github.com/macs3-project/MACS/blob/master/README.md>`_ callpeak quality control plots** show for all
`MACS2 <https://github.com/macs3-project/MACS/blob/master/README.md>`_ **callpeak quality control plots** show for all
samples and their associated controls the number of peaks and their distributions of peak length, fold-change, FDR and
p-value from the results of the
`MACS callpeak analysis <https://hbctraining.github.io/Intro-to-ChIPseq/lessons/05_peak_calling_macs.html>`_.
......
**`Phantompeakqualtools plot <https://code.google.com/archive/p/phantompeakqualtools/>`_** shows strand-shift versus
`Phantompeakqualtools plot <https://code.google.com/archive/p/phantompeakqualtools/>`_ shows strand-shift versus
cross-correlation and computes informative enrichment and quality measures for ChIP-seq data. It also calculates the
relative (RSC) and the normalized strand cross-correlation coefficient (NSC). Datasets with NSC values much less than
1.1 tend to have low signal to noise or few peaks. This may indicate poor quality or only few binding sites. Datasets with
RSC values significantly lower than 1 (< 0.8) tend to have low signal to noise. Low scores can have several causes and
are often due to failed or poor quality ChIP or low read sequence quality. The section "ChIP-seq processing pipeline"
of the MultiQC report contains additional plots across all samples for RSC, NSC and strand-shift cross-correlation.
For more information about Phantompeakqualtools please
see `documentation <https://github.com/kundajelab/phantompeakqualtools/blob/master/README.md>`_. For more information
about interpretation see published `article <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/>`_.
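The NSC and RSC rules of thumb above could be applied to a sample roughly as follows. This is only a sketch: the function name `cross_correlation_flags` is hypothetical, and treating "much less than 1.1" as a strict comparison against 1.1 is a simplifying assumption. The cutoffs 1.1 and 0.8 are taken from the description above.

```python
def cross_correlation_flags(nsc, rsc, nsc_cutoff=1.1, rsc_cutoff=0.8):
    """Return warning labels for one sample based on NSC and RSC.

    Hypothetical helper. The strict comparisons are an assumption of this
    sketch; the cutoffs come from the text above.
    """
    flags = []
    if nsc < nsc_cutoff:
        flags.append("low NSC")  # low signal to noise or few peaks
    if rsc < rsc_cutoff:
        flags.append("low RSC")  # low signal to noise
    return flags


if __name__ == "__main__":
    print(cross_correlation_flags(nsc=1.05, rsc=0.5))
    print(cross_correlation_flags(nsc=1.3, rsc=1.2))
```

A flagged sample is a prompt to inspect the cross-correlation plot, not a verdict, since low scores can have several causes.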
**deepTools `plotProfile <https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html>`_** creates a
**deepTools** `plotProfile <https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html>`_ creates a
profile plot for scores over sets of genomic regions and gives a visualisation for the genome-wide enrichment of the
samples. The `scores <https://deeptools.readthedocs.io/en/develop/content/tools/computeMatrix.html>`_ represent the
signal over a set of regions where all regions are scaled to the same size.
**`Mean quality by cycle plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360040506831-MeanQualityByCycle-Picard->`_ (Picard)** is used as
`Mean quality by cycle plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360040506831-MeanQualityByCycle-Picard->`_ **(Picard)** is used as
quality control for sequencing machine performance and collects the mean quality by cycle of the bam files after
filtering, sorting, merging and removing orphans. Any spikes in quality within reads may indicate technical problems
during sequencing. For more information about `collected Picard metrics
......
**`Quality score distribution plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360037057312-QualityScoreDistribution-Picard->`_ (Picard)** is used
`Quality score distribution plot
<https://gatk.broadinstitute.org/hc/en-us/articles/360037057312-QualityScoreDistribution-Picard->`_ **(Picard)** is used
as overall quality control for a library in a given run. It shows for the range of quality scores the corresponding
total numbers of bases. For more information about `collected Picard metrics
<https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard->`_ please
......
......@@ -196,22 +196,22 @@ rule featurecounts_deseq2:
directory("results/deseq2/comparison_plots/MA_plots/FDR_0.01_{antibody}consensus_{peak}-peaks"),
patterns=["{antibody}.{{group_1_vs_group_2}}.MA-plot_FDR_0.01.pdf"],
caption = "../report/plot_deseq2_FDR_1_perc_MA.rst",
category = "DESeq2-FDR"),
category = "DESeq2"),
plot_FDR_5_perc_MA=report(
directory("results/deseq2/comparison_plots/MA_plots/FDR_0.05_{antibody}consensus_{peak}-peaks"),
patterns=["{antibody}.{{group_1_vs_group_2}}.MA-plot_FDR_0.05.pdf"],
caption = "../report/plot_deseq2_FDR_5_perc_MA.rst",
category = "DESeq2-FDR"),
category = "DESeq2"),
plot_FDR_1_perc_volcano=report(
directory("results/deseq2/comparison_plots/volcano_plots/FDR_0.01_{antibody}consensus_{peak}-peaks"),
patterns=["{antibody}.{{group_1_vs_group_2}}.volcano-plot_FDR_0.01.pdf"],
caption = "../report/plot_deseq2_FDR_1_perc_volcano.rst",
category = "DESeq2-FDR"),
category = "DESeq2"),
plot_FDR_5_perc_volcano=report(
directory("results/deseq2/comparison_plots/volcano_plots/FDR_0.05_{antibody}consensus_{peak}-peaks"),
patterns=["{antibody}.{{group_1_vs_group_2}}.volcano-plot_FDR_0.05.pdf"],
caption = "../report/plot_deseq2_FDR_5_perc_volcano.rst",
category = "DESeq2-FDR"),
category = "DESeq2"),
plot_sample_corr_heatmap=report(
directory("results/deseq2/comparison_plots/correlation_heatmaps_{antibody}consensus_{peak}-peaks"),
patterns=["{antibody}.{{group_1_vs_group_2}}.correlation_heatmap.pdf"],
......
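The doubled braces in the `patterns` entries keep `group_1_vs_group_2` as a placeholder while `{antibody}` is filled in. Python's own `str.format` follows the same doubling convention, which the sketch below uses as an analogy. Whether Snakemake handles these strings with exactly this mechanism is an assumption here, and the helper name and the antibody value `H3K4me3` are hypothetical.

```python
def fill_output_wildcards(pattern, **wildcards):
    """Fill single-brace fields; doubled braces become literal braces.

    Hypothetical helper that uses str.format only as an analogy for the
    brace-escaping convention.
    """
    return pattern.format(**wildcards)


if __name__ == "__main__":
    pattern = "{antibody}.{{group_1_vs_group_2}}.MA-plot_FDR_0.05.pdf"
    print(fill_output_wildcards(pattern, antibody="H3K4me3"))

    # With str.format, an unbalanced escape such as "{{name}" is an error.
    try:
        fill_output_wildcards("{antibody}.{{name}.pdf", antibody="H3K4me3")
    except ValueError as err:
        print("unbalanced braces:", err)
```

Under this convention every `{{` needs a matching `}}`, so a pattern is only well formed when the escaped placeholder is closed with two braces.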
......@@ -16,7 +16,7 @@ rule multiqc:
input:
get_multiqc_input
output:
report("results/qc/multiqc/multiqc.html", category="MultiQC report")
report("results/qc/multiqc/multiqc.html", caption="../report/multiqc_report.rst", category="QC")
log:
"logs/multiqc.log"
wrapper:
......