Commit bd56726e authored by AntonieV

Changes according to PR#8

parent 90b21c30
# This file should contain everything to configure the workflow on a global scale. # This file should contains everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains # In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config/samples.tsv" samples: "config/samples.tsv"
...@@ -20,7 +20,7 @@ resources: ...@@ -20,7 +20,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: GRCh38 build: GRCh38
# for testing data a specific chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: chromosome:
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -53,7 +53,7 @@ params: ...@@ -53,7 +53,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -63,7 +63,7 @@ params: ...@@ -63,7 +63,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
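Note on the samtools view settings above: the hex values passed to `-F`, `-G` and `-f` are SAM flag bit masks. The following is a minimal sketch, not part of the workflow, that decodes the parameter strings from the config. The helper names `decode_flag` and `parse_view_params` are hypothetical; the bit meanings follow the SAM specification, and the option semantics (`-f` requires all bits, `-F` excludes reads with any of the bits, `-G` excludes reads with all of the bits) follow the samtools-view documentation linked in the comment.

```python
# Names of the SAM flag bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "read is PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(value):
    """Return the names of all flag bits set in value."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def _to_int(text):
    # Accept hex masks such as "0x004" as well as plain decimal values.
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def parse_view_params(params):
    """Collect the flag filters (-f, -F, -G) from a samtools view parameter string."""
    tokens = params.split()
    filters = {}
    for option, value in zip(tokens, tokens[1:]):
        if option in ("-f", "-F", "-G"):
            filters[option] = _to_int(value)
    return filters


if __name__ == "__main__":
    # The paired-end default from the config shown above.
    for option, mask in parse_view_params("-b -F 0x004 -G 0x009 -f 0x001").items():
        print(option, hex(mask), decode_flag(mask))
```

Read this way, the paired-end default drops unmapped reads (`-F 0x004`), drops reads that are paired but whose mate is unmapped (`-G 0x009`), and keeps only paired reads (`-f 0x001`). The optional `-F 0x0400` suggested in the comment corresponds to the duplicate bit.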
...@@ -6,36 +6,35 @@ D E2 batch2 AJ ERa ...@@ -6,36 +6,35 @@ D E2 batch2 AJ ERa
E TNFa batch1 AK ERa E TNFa batch1 AK ERa
F TNFa batch2 AL ERa F TNFa batch2 AL ERa
G E2_TNFa batch1 AM ERa G E2_TNFa batch1 AM ERa
H E2_TNFa batch2 AN ERa H Veh batch1 AG p65
I Veh batch1 AG p65 I Veh batch2 AH p65
J Veh batch2 AH p65 J E2 batch1 AI p65
K E2 batch1 AI p65 K E2 batch2 AJ p65
L E2 batch2 AJ p65 L TNFa batch1 AK p65
M TNFa batch1 AK p65 M TNFa batch2 AL p65
N TNFa batch2 AL p65 N E2_TNFa batch1 AM p65
O E2_TNFa batch1 AM p65 O E2_TNFa batch2 AN p65
P E2_TNFa batch2 AN p65 P Veh batch1 AG FoxA1
Q Veh batch1 AG FoxA1 Q Veh batch2 AH FoxA1
R Veh batch2 AH FoxA1 R E2 batch1 AI FoxA1
S E2 batch1 AI FoxA1 S E2 batch2 AJ FoxA1
T E2 batch2 AJ FoxA1 T TNFa batch1 AK FoxA1
U TNFa batch1 AK FoxA1 U TNFa batch2 AL FoxA1
V TNFa batch2 AL FoxA1 V E2_TNFa batch1 AM FoxA1
W E2_TNFa batch1 AM FoxA1 W E2_TNFa batch2 AM FoxA1
X E2_TNFa batch2 AM FoxA1 X E2_TNFa batch1 AM ERa
Y E2_TNFa batch1 AM ERa Y E2_TNFa batch1 AM ERa
Z E2_TNFa batch1 AM ERa Z E2_TNFa batch1 AM ERa
AA E2_TNFa batch1 AM ERa AA E2_TNFa batch1 AM ERa
AB E2_TNFa batch1 AM ERa AB E2_TNFa batch2 AN ERa
AC E2_TNFa batch2 AN ERa AC E2_TNFa batch2 AN ERa
AD E2_TNFa batch2 AN ERa AD E2_TNFa batch2 AN ERa
AE E2_TNFa batch2 AN ERa AE E2_TNFa batch2 AN ERa
AF E2_TNFa batch2 AN ERa AF Veh batch1
AG Veh batch1 AG Veh batch2
AH Veh batch2 AH E2 batch1
AI E2 batch1 AI E2 batch2
AJ E2 batch2 AJ TNFa batch1
AK TNFa batch1 AK TNFa batch2
AL TNFa batch2 AL E2_TNFa batch1
AM E2_TNFa batch1 AM E2_TNFa batch2
AN E2_TNFa batch2
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
A 1 SRR1635443 ILLUMINA A 1 SRR1635443 ILLUMINA
B 1 SRR1635444 ILLUMINA B 1 SRR1635444 ILLUMINA
C 1 300 14 SRR1635445 ILLUMINA C 1 SRR1635445 ILLUMINA
D 1 SRR1635446 ILLUMINA D 1 SRR1635446 ILLUMINA
E 1 SRR1635447 ILLUMINA E 1 SRR1635447 ILLUMINA
F 1 SRR1635448 ILLUMINA F 1 SRR1635448 ILLUMINA
G 1 SRR1635449 ILLUMINA G 1 SRR1635449 ILLUMINA
H 1 SRR1635450 ILLUMINA G 2 SRR1635450 ILLUMINA
I 1 SRR1635451 ILLUMINA H 1 SRR1635451 ILLUMINA
J 1 SRR1635452 ILLUMINA I 1 SRR1635452 ILLUMINA
K 1 SRR1635453 ILLUMINA J 1 SRR1635453 ILLUMINA
L 1 SRR1635454 ILLUMINA K 1 SRR1635454 ILLUMINA
M 1 SRR1635455 ILLUMINA L 1 SRR1635455 ILLUMINA
N 1 SRR1635456 ILLUMINA M 1 SRR1635456 ILLUMINA
O 1 SRR1635457 ILLUMINA N 1 SRR1635457 ILLUMINA
P 1 SRR1635458 ILLUMINA O 1 SRR1635458 ILLUMINA
Q 1 SRR1635459 ILLUMINA P 1 SRR1635459 ILLUMINA
R 1 SRR1635460 ILLUMINA Q 1 SRR1635460 ILLUMINA
S 1 SRR1635461 ILLUMINA R 1 SRR1635461 ILLUMINA
T 1 SRR1635462 ILLUMINA S 1 SRR1635462 ILLUMINA
U 1 SRR1635463 ILLUMINA T 1 SRR1635463 ILLUMINA
V 1 SRR1635464 ILLUMINA U 1 SRR1635464 ILLUMINA
W 1 SRR1635465 ILLUMINA V 1 SRR1635465 ILLUMINA
X 1 SRR1635466 ILLUMINA W 1 SRR1635466 ILLUMINA
Y 1 SRR1635467 ILLUMINA X 1 SRR1635467 ILLUMINA
Z 1 SRR1635468 ILLUMINA Y 1 SRR1635468 ILLUMINA
AA 1 SRR1635469 ILLUMINA Z 1 SRR1635469 ILLUMINA
AB 1 SRR1635470 ILLUMINA AA 1 SRR1635470 ILLUMINA
AC 1 SRR1635471 ILLUMINA AB 1 SRR1635471 ILLUMINA
AD 1 SRR1635472 ILLUMINA AC 1 SRR1635472 ILLUMINA
AE 1 SRR1635473 ILLUMINA AD 1 SRR1635473 ILLUMINA
AF 1 SRR1635474 ILLUMINA AE 1 SRR1635474 ILLUMINA
AG 1 SRR1635435 ILLUMINA AF 1 SRR1635435 ILLUMINA
AH 1 SRR1635436 ILLUMINA AG 1 SRR1635436 ILLUMINA
AI 1 SRR1635437 ILLUMINA AH 1 SRR1635437 ILLUMINA
AJ 1 SRR1635438 ILLUMINA AI 1 SRR1635438 ILLUMINA
AK 1 SRR1635439 ILLUMINA AJ 1 SRR1635439 ILLUMINA
AL 1 SRR1635440 ILLUMINA AK 1 SRR1635440 ILLUMINA
AM 1 SRR1635441 ILLUMINA AL 1 SRR1635441 ILLUMINA
AN 1 SRR1635442 ILLUMINA AM 1 SRR1635442 ILLUMINA
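Note on the units.tsv layout above: after this change each row carries only `sample`, `unit`, `fq1`, `fq2`, `sra_accession` and `platform`. The sketch below shows one way such a table could be read and how the read source of a unit might be inferred from which columns are filled. It is an illustration under assumptions, not the workflow's own code: the helper names `read_units` and `fastq_source` are hypothetical, the precedence (local files before SRA accession) is assumed, and the example rows are taken from different test configs in this commit and combined for brevity.

```python
import csv
import io

# Tab-separated example rows (illustrative mix of SRA and local-file units).
UNITS_TSV = (
    "sample\tunit\tfq1\tfq2\tsra_accession\tplatform\n"
    "A\t1\t\t\tSRR1635443\tILLUMINA\n"
    "G\t2\t\t\tSRR1635450\tILLUMINA\n"
    "B\t1\tdata/single_end_test_data/B-1_chr21.fastq.gz\t\t\tILLUMINA\n"
    "C\t1\tdata/paired_end_test_data/C-1_vii_1.fastq.gz\t"
    "data/paired_end_test_data/C-1_vii_2.fastq.gz\t\tILLUMINA\n"
)


def read_units(text):
    """Parse a tab-separated units table into a list of dicts, one per unit."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def fastq_source(row):
    """Classify where the reads of one unit come from (assumed precedence)."""
    if row["fq1"] and row["fq2"]:
        return "paired-end files"
    if row["fq1"]:
        return "single-end file"
    if row["sra_accession"]:
        return "SRA download"
    raise ValueError(f"unit {row['sample']}-{row['unit']} has no read source")


if __name__ == "__main__":
    for unit in read_units(UNITS_TSV):
        print(unit["sample"], unit["unit"], fastq_source(unit))
```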
# This file should contain everything to configure the workflow on a global scale. # This file should contains everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains # In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config_paired_end_reduced/samples.tsv" samples: "config_paired_end_reduced/samples.tsv"
...@@ -17,7 +17,7 @@ resources: ...@@ -17,7 +17,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: R64-1-1 build: R64-1-1
# for testing data a single chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: chromosome:
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -50,7 +50,7 @@ params: ...@@ -50,7 +50,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -60,7 +60,7 @@ params: ...@@ -60,7 +60,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
A 1 data/atacseq/test-datasets/testdata/SRR1822153_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822153_2.fastq.gz ILLUMINA A 1 data/atacseq/test-datasets/testdata/SRR1822153_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822153_2.fastq.gz ILLUMINA
B 1 data/atacseq/test-datasets/testdata/SRR1822154_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822154_2.fastq.gz ILLUMINA B 1 data/atacseq/test-datasets/testdata/SRR1822154_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822154_2.fastq.gz ILLUMINA
C 1 300 14 data/atacseq/test-datasets/testdata/SRR1822157_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822157_2.fastq.gz ILLUMINA C 1 data/atacseq/test-datasets/testdata/SRR1822157_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822157_2.fastq.gz ILLUMINA
D 1 data/atacseq/test-datasets/testdata/SRR1822158_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822158_2.fastq.gz ILLUMINA D 1 data/atacseq/test-datasets/testdata/SRR1822158_1.fastq.gz data/atacseq/test-datasets/testdata/SRR1822158_2.fastq.gz ILLUMINA
E 1 data/chipseq/test-datasets/testdata/SRR5204809_Spt5-ChIP_Input1_SacCer_ChIP-Seq_ss100k_R1.fastq.gz data/chipseq/test-datasets/testdata/SRR5204809_Spt5-ChIP_Input1_SacCer_ChIP-Seq_ss100k_R2.fastq.gz ILLUMINA E 1 data/chipseq/test-datasets/testdata/SRR5204809_Spt5-ChIP_Input1_SacCer_ChIP-Seq_ss100k_R1.fastq.gz data/chipseq/test-datasets/testdata/SRR5204809_Spt5-ChIP_Input1_SacCer_ChIP-Seq_ss100k_R2.fastq.gz ILLUMINA
F 1 data/chipseq/test-datasets/testdata/SRR5204810_Spt5-ChIP_Input2_SacCer_ChIP-Seq_ss100k_R1.fastq.gz data/chipseq/test-datasets/testdata/SRR5204810_Spt5-ChIP_Input2_SacCer_ChIP-Seq_ss100k_R2.fastq.gz ILLUMINA F 1 data/chipseq/test-datasets/testdata/SRR5204810_Spt5-ChIP_Input2_SacCer_ChIP-Seq_ss100k_R1.fastq.gz data/chipseq/test-datasets/testdata/SRR5204810_Spt5-ChIP_Input2_SacCer_ChIP-Seq_ss100k_R2.fastq.gz ILLUMINA
# This file should contain everything to configure the workflow on a global scale. # This file should contains everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains # In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config_paired_end_reduced/samples.tsv" samples: "config_paired_end_reduced/samples.tsv"
...@@ -17,7 +17,7 @@ resources: ...@@ -17,7 +17,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: R64-1-1 build: R64-1-1
# for testing data a specific chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: VII chromosome: VII
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -50,7 +50,7 @@ params: ...@@ -50,7 +50,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -60,7 +60,7 @@ params: ...@@ -60,7 +60,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
A 1 data/paired_end_test_data/A-1_vii_1.fastq.gz data/paired_end_test_data/A-1_vii_2.fastq.gz ILLUMINA A 1 data/paired_end_test_data/A-1_vii_1.fastq.gz data/paired_end_test_data/A-1_vii_2.fastq.gz ILLUMINA
B 1 data/paired_end_test_data/B-1_vii_1.fastq.gz data/paired_end_test_data/B-1_vii_2.fastq.gz ILLUMINA B 1 data/paired_end_test_data/B-1_vii_1.fastq.gz data/paired_end_test_data/B-1_vii_2.fastq.gz ILLUMINA
C 1 300 14 data/paired_end_test_data/C-1_vii_1.fastq.gz data/paired_end_test_data/C-1_vii_2.fastq.gz ILLUMINA C 1 data/paired_end_test_data/C-1_vii_1.fastq.gz data/paired_end_test_data/C-1_vii_2.fastq.gz ILLUMINA
D 1 data/paired_end_test_data/D-1_vii_1.fastq.gz data/paired_end_test_data/D-1_vii_2.fastq.gz ILLUMINA D 1 data/paired_end_test_data/D-1_vii_1.fastq.gz data/paired_end_test_data/D-1_vii_2.fastq.gz ILLUMINA
E 1 data/paired_end_test_data/E-1_vii_1.fastq.gz data/paired_end_test_data/E-1_vii_2.fastq.gz ILLUMINA E 1 data/paired_end_test_data/E-1_vii_1.fastq.gz data/paired_end_test_data/E-1_vii_2.fastq.gz ILLUMINA
# This file should contain everything to configure the workflow on a global scale. # This file should contains everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains # In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config_single_end/samples.tsv" samples: "config_single_end/samples.tsv"
...@@ -20,7 +20,7 @@ resources: ...@@ -20,7 +20,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: GRCh38 build: GRCh38
# for testing data a specific chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: chromosome:
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -53,7 +53,7 @@ params: ...@@ -53,7 +53,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -63,7 +63,7 @@ params: ...@@ -63,7 +63,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
A 1 SRR1635455 ILLUMINA A 1 SRR1635455 ILLUMINA
B 1 SRR1635456 ILLUMINA B 1 SRR1635456 ILLUMINA
C 1 SRR1635457 ILLUMINA C 1 SRR1635457 ILLUMINA
C 2 SRR1635458 ILLUMINA C 2 SRR1635458 ILLUMINA
D 1 SRR1635439 ILLUMINA D 1 SRR1635439 ILLUMINA
E 1 SRR1635441 ILLUMINA E 1 SRR1635441 ILLUMINA
# This file should contain everything to configure the workflow on a global scale. # This file should contains everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains # In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config_single_end_reduced/samples.tsv" samples: "config_single_end_reduced/samples.tsv"
...@@ -20,7 +20,7 @@ resources: ...@@ -20,7 +20,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: GRCh38 build: GRCh38
# for testing data a specific chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: 21 chromosome: 21
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -53,7 +53,7 @@ params: ...@@ -53,7 +53,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -63,7 +63,7 @@ params: ...@@ -63,7 +63,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
A 1 data/single_end_test_data/A-1_chr21.fastq.gz ILLUMINA A 1 data/single_end_test_data/A-1_chr21.fastq.gz ILLUMINA
B 1 data/single_end_test_data/B-1_chr21.fastq.gz ILLUMINA B 1 data/single_end_test_data/B-1_chr21.fastq.gz ILLUMINA
C 1 data/single_end_test_data/C-1_chr21.fastq.gz ILLUMINA C 1 data/single_end_test_data/C-1_chr21.fastq.gz ILLUMINA
C 2 data/single_end_test_data/C-2_chr21.fastq.gz ILLUMINA C 2 data/single_end_test_data/C-2_chr21.fastq.gz ILLUMINA
D 1 data/single_end_test_data/D-1_chr21.fastq.gz ILLUMINA D 1 data/single_end_test_data/D-1_chr21.fastq.gz ILLUMINA
E 1 data/single_end_test_data/E-1_chr21.fastq.gz ILLUMINA E 1 data/single_end_test_data/E-1_chr21.fastq.gz ILLUMINA
...@@ -5,7 +5,7 @@ To configure this workflow, modify ``config/config.yaml`` according to your need ...@@ -5,7 +5,7 @@ To configure this workflow, modify ``config/config.yaml`` according to your need
# Sample sheet # Sample sheet
Add samples to `config/samples.tsv`. For each sample, the columns `sample`, `group`, `control`, and `antibody` have to be defined. Add samples to `config/samples.tsv`. For each sample, the columns `sample`, `group`, `control`, and `antibody` have to be defined.
* Samples / IP (immunoprecipitations) within the same `group` represents replicates and must have the same antibody and the same control. * Samples / IP (immunoprecipitations) within the same `group` represent replicates and must have the same antibody and the same control.
* Controls / Input are listed like samples, but they do not have entries in the columns for `control` and `antibody`. * Controls / Input are listed like samples, but they do not have entries in the columns for `control` and `antibody`.
* The identifiers of each control has to be noted in the column `sample`. * The identifiers of each control has to be noted in the column `sample`.
* For all samples, the identifiers of the corresponding controls have to be given in the `control` column (see example below). * For all samples, the identifiers of the corresponding controls have to be given in the `control` column (see example below).
......
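Note on the sample sheet rules above: the relationship between the `sample` and `control` columns can be checked mechanically. The following is a minimal sketch under assumptions, not part of the workflow: the function name `check_sample_sheet` is hypothetical, rows without a `control` entry are treated as controls / input, and only the control-related rules from the list are checked (the replicate rule for `group` is left out).

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "group", "control", "antibody")


def check_sample_sheet(text):
    """Return a list of problems found in a tab-separated sample sheet."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["sample sheet is empty"]
    missing = [col for col in REQUIRED_COLUMNS if col not in rows[0]]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    by_id = {row["sample"]: row for row in rows}
    problems = []
    for row in rows:
        name = row["sample"]
        if not row["control"]:
            # Treated as a control / input row: it must not name an antibody.
            if row["antibody"]:
                problems.append(f"{name}: has an antibody but no control")
            continue
        if not row["antibody"]:
            problems.append(f"{name}: has a control but no antibody")
        control = by_id.get(row["control"])
        if control is None:
            problems.append(f"{name}: control {row['control']} is not listed as a sample")
        elif control["control"] or control["antibody"]:
            problems.append(f"{name}: control {row['control']} is not an input row")
    return problems


if __name__ == "__main__":
    sheet = (
        "sample\tgroup\tcontrol\tantibody\n"
        "A\tVeh\tAG\tERa\n"
        "AG\tVeh\t\t\n"
    )
    print(check_sample_sheet(sheet) or "sample sheet looks consistent")
```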
# This file contain everything to configure the workflow on a global scale. # This file contains everything to configure the workflow on a global scale.
# The sample based data must be complemented by a samples.tsv file that contains # The sample based data must be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas. # one row per sample. It can be parsed easily via pandas.
samples: "config/samples.tsv" samples: "config/samples.tsv"
# to download reads from SRA the accession numbers (see https://www.ncbi.nlm.nih.gov/sra) of samples must be given in units.tsv # The source of fastq files for every sequencing unit of all samples has to be provided in the units.tsv file.
units: "config/units.tsv" units: "config/units.tsv"
single_end: False single_end: False
...@@ -17,7 +17,7 @@ resources: ...@@ -17,7 +17,7 @@ resources:
release: 101 release: 101
# Genome build # Genome build
build: R64-1-1 build: R64-1-1
# for testing data a specific chromosome can be selected # for testing data a single chromosome can be selected (leave empty for a regular analysis)
chromosome: chromosome:
# specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2 # specify release version number of igenomes list to use (see https://github.com/nf-core/chipseq/releases), e.g. 1.2.2
igenomes_release: 1.2.2 igenomes_release: 1.2.2
...@@ -50,7 +50,7 @@ params: ...@@ -50,7 +50,7 @@ params:
activate: True activate: True
consensus-peak-analysis: consensus-peak-analysis:
activate: True activate: True
# samtools view parameters: # samtools view parameter suggestions (for full parameters, see: https://www.htslib.org/doc/samtools-view.html):
# if duplicates should be removed in this filtering, add "-F 0x0400" to the params # if duplicates should be removed in this filtering, add "-F 0x0400" to the params
# if for each read, you only want to retain a single (best) mapping, add "-q 1" to params # if for each read, you only want to retain a single (best) mapping, add "-q 1" to params
# if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions), # if you would like to restrict analysis to certain regions (e.g. excluding other "blacklisted" regions),
...@@ -60,7 +60,7 @@ params: ...@@ -60,7 +60,7 @@ params:
samtools-view-se: "-b -F 0x004" samtools-view-se: "-b -F 0x004"
samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001" samtools-view-pe: "-b -F 0x004 -G 0x009 -f 0x001"
plotfingerprint: plotfingerprint:
# Number of bins that sampled from the genome, for which the overlapping number of reads is computed for fingerprint plot # --numberOfSamples parameter of deeptools plotFingerprint, see: https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html#Optional%20arguments
number-of-samples: 500000 number-of-samples: 500000
# optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step # optional parameters for picard's CollectMultipleMetrics from sorted, filtered and merged bam files in post analysis step
# see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard- # see https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard-
......
sample unit fragment_len_mean fragment_len_sd fq1 fq2 sra_accession platform sample unit fq1 fq2 sra_accession platform
**MultiQC** report is a collection of multiple plots and stats from Chip-seq processing pipeline (phantompeakqualtools), The **MultiQC** report is a collection of multiple plots, stats and metrics from phantompeakqualtools (Chip-seq processing), Preseq, Picard, Samtools and FastQC.
Preseq analysis, metrics from Picard and Samtools. There are also quality control metrics and plots from FastQC analysis. For detailed descriptions of the individual plots and statistics, load the MultiQC report by clicking on it.
Detailed descriptions of the individual plots and statistics can be found in MultiQC report.
`Homer annotatePeaks <http://homer.ucsd.edu/homer/ngs/annotation.html>`_ is used to annotate the peaks relative `Homer annotatePeaks <http://homer.ucsd.edu/homer/ngs/annotation.html>`_ assigns known genomic features to peaks.
to known genomic features. The plots of this analysis show for all samples and their associated controls peak location For each sample-control pair, plots show the peak locations relative to annotated features, the percentage of unique genes to closest peak and the peak distribution relative to TSS (Transcription Start Site).
relative to annotation, percentage of unique genes to closest peak an peak distribution relative to TSS (Transcription Start Site).
`Base distribution by cycle plot `Plot of base distribution per sequencing cycle
<https://gatk.broadinstitute.org/hc/en-us/articles/360042477312-CollectBaseDistributionByCycle-Picard->`_ **(Picard)** is <https://gatk.broadinstitute.org/hc/en-us/articles/360042477312-CollectBaseDistributionByCycle-Picard->`_.
used as quality control for alignment-level and shows the nucleotide distribution per cycle of the bam files after This **Picard** tool shows the nucleotide distribution per sequencing cycle of the bam files after filtering, sorting, merging and removing orphans.
filtering, sorting, merging and removing orphans. For any cycle within reads the relative proportions of nucleotides For any sequencing cycle within reads, the relative proportions of nucleotides should reflect the AT:CG content.
should reflect the AT:CG content. For all nucleotides flattish lines would be expected and any spikes would suggest a For all nucleotides, flat lines would be expected and any spikes suggest a systematic sequencing error.
systematic sequencing error. For more information about `collected Picard metrics For more information about `collected Picard metrics
<https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard->`_ please <https://gatk.broadinstitute.org/hc/en-us/articles/360037594031-CollectMultipleMetrics-Picard->`_ please
see `documentation <https://broadinstitute.github.io/picard/>`_. see the `documentation <https://broadinstitute.github.io/picard/>`_.
**MACS2 and bedtools** merged consensus {{ snakemake.wildcards.peak }} peaks plot is generated by calculating the **MACS2 and bedtools** merged consensus {{ snakemake.wildcards.peak }} peaks plot is generated by calculating the
proportion of intersection size assigned to {{ snakemake.wildcards.samples }} for {{ snakemake.wildcards.antibody }}. proportion of intersection size assigned to all samples for {{ snakemake.wildcards.antibody }}.
`MA plot <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf#Rfn.plotMA>`_ **(FDR 0.01)** The `MA plot <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#ma-plot>`_ **(FDR 0.01)**
shows the log2 fold changes versus the mean of normalized counts from displays the log2 fold changes versus the mean of normalized counts of the
`DESeq2 <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf>`_ analysis filtered on a false DESeq2 analysis results.
discovery rate (FDR) threshold of 0.01 for pairwise comparisons of samples across the groups from a particular antibody. The results of this plot are filtered on a false discovery rate (FDR) threshold of 0.01 and represent the comparison of the
For more information about DESeq2 please see {{snakemake.wildcards["group_1"]}} versus {{snakemake.wildcards["group_2"]}} groups treated with the
{{snakemake.wildcards["antibody"]}} antibody. For more information about DESeq2 please see
`documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_. `documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
**Volcano plot (FDR 0.01)** shows the significance (adjusted p-value) versus the log2 fold changes of the results of the **Volcano plot (FDR 0.01)** shows the significance (adjusted p-value) versus the log2 fold changes of the
`DESeq2 <https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf>`_ analysis filtered on a false DESeq2 analysis results.
discovery rate (FDR) threshold of 0.01 for pairwise comparisons of samples across the groups from a particular antibody. The results of this plot are filtered on a false discovery rate (FDR) threshold of 0.01 and represent the comparison of the
For more information about DESeq2 please see {{snakemake.wildcards["group_1"]}} versus {{snakemake.wildcards["group_2"]}} groups for the
{{snakemake.wildcards["antibody"]}} antibody. For more information about DESeq2 please see
`documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_. `documentation <https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html>`_.
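Note on the MA and volcano plot captions above: both plots draw on the same DESeq2 result columns. The sketch below illustrates how the plotted coordinates relate to those columns and to the FDR threshold of 0.01. It assumes the standard DESeq2 column names `baseMean`, `log2FoldChange` and `padj`. The example values are invented, the helper names are hypothetical, and whether the workflow drops or merely highlights results above the threshold is not stated in the captions.

```python
import math

FDR_THRESHOLD = 0.01

# Invented example rows using DESeq2 result column names.
results = [
    {"peak": "peak_1", "baseMean": 250.0, "log2FoldChange": 2.0, "padj": 0.001},
    {"peak": "peak_2", "baseMean": 40.0, "log2FoldChange": -0.3, "padj": 0.5},
    {"peak": "peak_3", "baseMean": 900.0, "log2FoldChange": -1.5, "padj": 0.0001},
]


def significant(rows, fdr=FDR_THRESHOLD):
    """Keep rows whose adjusted p-value is below the FDR threshold."""
    return [row for row in rows if row["padj"] is not None and row["padj"] < fdr]


def ma_point(row):
    """MA plot: mean of normalized counts (x) versus log2 fold change (y)."""
    return (row["baseMean"], row["log2FoldChange"])


def volcano_point(row):
    """Volcano plot: log2 fold change (x) versus -log10 adjusted p-value (y)."""
    return (row["log2FoldChange"], -math.log10(row["padj"]))


if __name__ == "__main__":
    for row in significant(results):
        print(row["peak"], ma_point(row), volcano_point(row))
```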