IMP issues
https://git-r3lab.uni.lu/groups/IMP/-/issues
2017-12-19T09:20:54+01:00

https://git-r3lab.uni.lu/IMP/IMP/-/issues/13
Handling multiple fastq files
2017-12-19T09:20:54+01:00
Shaman Narayanasamy

Sometimes `fastq` files do not exist as a single pair of R1 and R2 files; a sample may instead come as multiple R1/R2 pairs. This is a potential request once we release IMP to the public.
This may require a major refactor.
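
A minimal sketch of how several R1/R2 pairs belonging to one sample might be grouped before pooling. This is not part of IMP: the function name `pair_fastq_files` and the `_R1`/`_R2` file-name convention are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: <prefix>_R1<suffix> / <prefix>_R2<suffix>,
# e.g. sampleA_L001_R1.fastq.gz. Real data sets may be named differently.
READ_TAG = re.compile(r"^(?P<prefix>.+)_R(?P<read>[12])(?P<suffix>\..+)$")


def pair_fastq_files(filenames):
    """Group file names into sorted (R1, R2) pairs; raise if a mate is missing."""
    mates = defaultdict(dict)
    for name in filenames:
        match = READ_TAG.match(name)
        if match is None:
            raise ValueError(f"unrecognised fastq name: {name}")
        # Files sharing prefix and suffix are treated as mates of one pair.
        key = (match["prefix"], match["suffix"])
        mates[key][match["read"]] = name

    pairs = []
    for key in sorted(mates):
        reads = mates[key]
        if set(reads) != {"1", "2"}:
            raise ValueError(f"incomplete pair for {key[0]}")
        pairs.append((reads["1"], reads["2"]))
    return pairs
```

The resulting pairs could then be pooled per read direction (all R1 files in order, all R2 files in the same order) so that downstream steps still see a single pair.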

https://git-r3lab.uni.lu/IMP/IMP/-/issues/77
MaxBin test data too small to give result
2017-12-19T09:20:53+01:00
Yohan Jarosz yohan.jarosz@uni.lu

The MaxBin test data on the imp runner is too small to produce results, so the test fails.
We need to provide better test data for that step. For now, the steps in the pipeline CI are set to `manual`.
Assignee: Shaman Narayanasamy

https://git-r3lab.uni.lu/IMP/IMP/-/issues/79
Metaquast not working
2017-11-10T03:44:08+01:00
Shaman Narayanasamy

The MetaQUAST report does not appear in the IMP main report.
Assignee: Shaman Narayanasamy

https://git-r3lab.uni.lu/IMP/IMP/-/issues/80
Add new binning tools
2017-11-10T03:44:08+01:00
Shaman Narayanasamy

MetaBAT
binny
Assignee: Shaman Narayanasamy

https://git-r3lab.uni.lu/IMP/IMP/-/issues/93
"CalledProcessError" while running 'impy init'
2018-02-12T15:48:31+01:00
tberben

I am following the IMP documentation regarding installation of impy, but when I run 'impy init' I always get the same error, namely:
``RuleException:
CalledProcessError in line 27 of /home/imp/code/rules/ini/databases.rules:
Command '
TMPD=$(mktemp -d -t --tmpdir=/home/imp/output/tmp "XXXXXX")
wget https://webdav-r3lab.uni.lu/public/R3lab/IMP/sortmerna.2.0.tgz --no-check-certificate -O $TMPD/sortmerna.tgz
tar -xzf $TMPD/sortmerna.tgz --strip-components=1 -C $TMPD
mkdir -p sortmerna
mv $TMPD/rRNA_databases/*.fasta sortmerna/.
rm -rf $TMPD
' returned non-zero exit status 4
File "/home/imp/code/rules/ini/databases.rules", line 27, in __rule_sortmerna_databases
File "/usr/lib/python3.4/concurrent/futures/thread.py", line 54, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message``
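
The command block above ends with `returned non-zero exit status 4`. Assuming that status comes from the `wget` call (the log does not say which command in the block failed), the GNU Wget manual lists status 4 as a network failure. A small lookup sketch follows; `describe_wget_status` is a hypothetical helper, not part of IMP.

```python
# Exit statuses as documented in the GNU Wget manual ("Exit Status" section).
WGET_EXIT_STATUS = {
    0: "No problems occurred",
    1: "Generic error code",
    2: "Parse error (command-line options, .wgetrc or .netrc)",
    3: "File I/O error",
    4: "Network failure",
    5: "SSL verification failure",
    6: "Username/password authentication failure",
    7: "Protocol errors",
    8: "Server issued an error response",
}


def describe_wget_status(status):
    """Return the documented meaning of a wget exit status."""
    return WGET_EXIT_STATUS.get(status, "Unknown status")
```

If the assumption holds, checking whether `webdav-r3lab.uni.lu` is reachable from inside the container (proxy, DNS, firewall) would be a reasonable first step.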
I'm running IMP 1.4.1 on Ubuntu Trusty, with Docker 17.03.1-ce and Python 3.4.3.

https://git-r3lab.uni.lu/IMP/IMP/-/issues/96
save space, compress reads
2017-05-31T11:35:21+02:00
Anna Buschart

Could IMP please automatically compress the read files from the different steps?

https://git-r3lab.uni.lu/IMP/IMP/-/issues/97
predict rRNAs
2017-05-31T11:34:06+02:00
Anna Buschart

I would love to have some rRNAs predicted within IMP.

https://git-r3lab.uni.lu/IMP/IMP/-/issues/98
Digital normalization
2017-05-31T11:24:59+02:00
Shaman Narayanasamy

Potential tools that could be applied are:
- https://arxiv.org/abs/1203.4802
- http://biorxiv.org/content/early/2017/05/03/133579

https://git-r3lab.uni.lu/IMP/IMP/-/issues/99
Metaproteomic analysis
2017-05-31T11:26:41+02:00
Shaman Narayanasamy

Many methods are available, but the best candidate so far is:
dx.doi.org/10.1371/journal.pcbi.1005224

https://git-r3lab.uni.lu/IMP/IMP/-/issues/100
Multi sample analysis
2017-05-31T11:32:49+02:00
Shaman Narayanasamy

We would need:
1. A means of providing multiple paired end sequences using:
- direct input to the command line
- a list of samples with their relevant paths (for very large studies)
2. After sample-wise analyses, "representative genomes" can be selected using dRep (http://biorxiv.org/content/early/2017/02/13/108142)
3. Reads from each sample should be remapped to the "representative genomes" generated by dRep
4. Other possible enhancements in this regard:
- Choice for a combined assembly (i.e. pooling reads from multiple samples). This strategy might work well for a smaller number of samples, technical replicates, analytical replicates and/or highly similar samples.

https://git-r3lab.uni.lu/IMP/IMP/-/issues/101
Automatic creation of databases after analyses
2017-05-31T11:36:45+02:00
Shaman Narayanasamy

At present, we just save text files. Possible solutions:
- MongoDB
- MySQL

https://git-r3lab.uni.lu/IMP/IMP/-/issues/102
Incorporate MultiQC
2017-06-12T16:33:33+02:00
Shaman Narayanasamy

Nice solution for reporting purposes:
http://multiqc.info/

https://git-r3lab.uni.lu/IMP/IMP/-/issues/103
Retain kmer graphs after assembly
2017-07-06T14:24:39+02:00
Shaman Narayanasamy

These could be potentially useful for integrated proteomic analyses.

https://git-r3lab.uni.lu/IMP/IMP/-/issues/104
MissingInputException
2017-07-12T09:30:20+02:00
jingzhejiang

Hi!
I have installed IMP and finished 'impy init' on an AWS instance, then saved it as an AMI.
When I start a new instance (c4.8*large) with this AMI and run:
```
impy run -t DYL1.fq.gz -t DYL2.fq.gz -o ~/DYL-IMP-default-Output --single-omics
Executing IMP command:
docker run --rm --name home_ubuntu_DYL_IMP_default_Output -v /home/ubuntu/DYL-rawdata/imp-db:/home/imp/databases -v /home/ubuntu/DYL-IMP-default-Output:/home/imp/output -v /home/ubuntu/DYL-rawdata:/home/imp/data -e "LOCAL_USER_ID=`id -u $USER`" -e "LOCAL_GROUP_ID=`id -g $USER`" -e IMP_BINNING_METHOD="maxbin" -e MEMTOTAL="8" -e MEMCORE="2" -e THREADS="4" -e MG="" -e MT="/home/imp/data/DYL1.fq.gz /home/imp/data/DYL2.fq.gz" -e IMP_ASSEMBLER="idba" -e IMP_STEPS="preprocessing assembly analysis binning report" docker-r3lab.uni.lu/imp/imp:1.4.1 snakemake -s /home/imp/code/Snakefile
usermod: no changes
MissingInputException in line 1 of /home/imp/code/rules/Preprocessing/trimming.rules:
Missing input files for rule trimming:
/home/imp/databases/adapters/adapters.done
```
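
The missing file is reported with its path inside the container. The `docker run` line in the log mounts host directories with `-v host:container`, so the path can be translated back to the host to check whether the file exists there. A minimal sketch, with the mount table copied from the log; `container_to_host` is a hypothetical helper, not part of impy.

```python
from pathlib import PurePosixPath

# -v mounts copied from the `docker run` line in the log (container -> host).
MOUNTS = {
    "/home/imp/databases": "/home/ubuntu/DYL-rawdata/imp-db",
    "/home/imp/output": "/home/ubuntu/DYL-IMP-default-Output",
    "/home/imp/data": "/home/ubuntu/DYL-rawdata",
}


def container_to_host(path, mounts=MOUNTS):
    """Return the host path behind a container path, or None if it is not mounted."""
    target = PurePosixPath(path)
    for container_dir, host_dir in mounts.items():
        try:
            # relative_to raises ValueError when target is outside container_dir.
            relative = target.relative_to(container_dir)
        except ValueError:
            continue
        return str(PurePosixPath(host_dir) / relative)
    return None
```

Calling `container_to_host("/home/imp/databases/adapters/adapters.done")` gives the host-side location to inspect. If that file is absent on the new instance, the databases produced by `impy init` may live in a different directory than the one being mounted; that is only a guess from the log.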
What is the problem?
Thank you!

https://git-r3lab.uni.lu/IMP/IMP/-/issues/105
megahit iterations: 99-25=74 modulo 4 is 2 and not 0
2017-10-04T12:29:15+02:00
Patrick May

With megahit you are producing k-mer assemblies from 25 to 97 in increments of 4, plus an additional 99 k-mer assembly, which is an artefact of the megahit tool.
At some point you should change this, also in the documentation; it is misleading because 99-25=74, and 74 modulo 4 is 2, not 0.

https://git-r3lab.uni.lu/IMP/IMP/-/issues/118
Automate for samples with multiple runs in sequencing
2017-10-03T17:04:56+02:00
Shaman Narayanasamy

Some samples are sequenced across separate sequencing runs. Usually, users would want to pool such runs and perform the analysis as if it were a single sample.
-Shaman-

https://git-r3lab.uni.lu/IMP/IMP/-/issues/122
Run IMP with testdataset
2018-02-09T10:38:55+01:00
easternbluebird

I tried to run IMP with the test dataset obtained from [here](https://webdav-r3lab.uni.lu/public/R3lab/IMP/test/).
My config file looks like this:
```
{
"threads": 24,
"memory_total_gb": 200,
"memory_per_core_gb": 200,
"outputdir": "/data/testdata/output/",
"raws": {
"Metagenomics": "/data/testdata/raw_data/metagenomics/",
"Metatranscriptomics": "/data/testdata/raw_data/metatranscriptomics/"
},
"assembler": "idba",
"vizbin": {
"perp": 10,
"cutoff": 1
}
}
```
```
$ tree testdata/
testdata/
├── config.json
└── raw_data
├── metagenomics
│ ├── mg.r1.fq
│ └── mg.r2.fq
└── metatranscriptomics
├── mt.r1.fq
└── mt.r2.fq
```
If I try to run IMP, I get the following error:
```
$ impy -c /data/testdata/config.json run
You should provide `metagenomics` and `metatranscriptomics` data.
Aborted!
```
Am I doing something wrong to access the config file?
(I am aware that I don't need 24 threads and 200 GB of memory for this dataset, but I have problems with another dataset and I am trying to find out whether the computational resources are a problem).

https://git-r3lab.uni.lu/IMP/IMP/-/issues/123
Error with tbl2asn when running with testdata
2018-02-12T16:51:33+01:00
easternbluebird

I have permission issues inside the Docker container: tbl2asn cannot be moved to `/usr/bin`. Any ideas how to fix this?
```
$ impy -a idba --threads 24 --memtotal 200 --memcore 200 run -m testdata/raw_data/metagenomics/mg.r1.fq -m testdata/raw_data/metagenomics/mg.r2.fq -t testdata/raw_data/metatranscriptomics/mt.r1.fq -t testdata/raw_data/metatranscriptomics/mt.r2.fq -o testdata/output | tee testdata/log.txt
```
```
(...)
22 of 58 steps (38%)
done
[2046/5879]
rule update_tbl2asn:
output: Analysis/tbl2asn.updated
[tbl2asn] This copy of tbl2asn is more than a year old. Please
download the current version.
--2018-02-09 10:12:02--
ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/tbl2asn/linux64.tbl2asn.gz
=> '/tmp/tbl2asn.gz'
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 130.14.250.10,
2607:f220:41e:250::7
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|130.14.250.10|:21...
connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1)
/toolbox/ncbi_tools/converters/by_program/tbl2asn ... done.
==> SIZE linux64.tbl2asn.gz ... 5342980
==> PASV ... done. ==> RETR linux64.tbl2asn.gz ... done.
Length: 5342980 (5.1M) (unauthoritative)
0K .......... .......... .......... .......... .......... 0%
158K 33s
50K .......... .......... .......... .......... .......... 1%
460K 22s
100K .......... .......... .......... .......... .......... 2%
45.2M 14
(...)
5100K .......... .......... .......... .......... .......... 98%
72.7M 0s
5150K .......... .......... .......... .......... .......... 99%
67.5M 0s
5200K .......... ....... 100%
69.7M=1.1s
2018-02-09 10:12:04 (4.69 MB/s) - '/tmp/tbl2asn.gz' saved [5342980]
mv: cannot move '/tmp/tbl2asn' to '/usr/bin/tbl2asn': Permission denied
Error in job update_tbl2asn while creating output file
Analysis/tbl2asn.updated.
RuleException:
CalledProcessError in line 46 of
/home/imp/code/rules/Analysis/prokka.rule:
Command '
OUT=$(tbl2asn -hp /tmp 2>&1)
echo $OUT
if [[ "$OUT" =~ "copy of tbl2asn is more than a year old" ]]; then
wget
ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/tbl2asn/linux64.tbl2asn.gz
-O /tmp/tbl2asn.gz
gzip -f -d /tmp/tbl2asn.gz
chmod +x /tmp/tbl2asn
mv /tmp/tbl2asn /usr/bin
fi
' returned non-zero exit status 1
File "/usr/lib/python3.4/concurrent/futures/thread.py", line 54, in run
Will exit after finishing currently running jobs.
/home/imp/data/metagenomics/mg.r1.fq => Preprocessing/mg.r1.fq
/home/imp/data/metagenomics/mg.r2.fq => Preprocessing/mg.r2.fq
/home/imp/data/metatranscriptomics/mt.r1.fq => Preprocessing/mt.r1.fq
/home/imp/data/metatranscriptomics/mt.r2.fq => Preprocessing/mt.r2.fq
Exiting because a job execution failed. Look above for error message
```

https://git-r3lab.uni.lu/IMP/IMP/-/issues/124
Annotate rule fails
2018-02-14T08:48:36+01:00
Fredrik Boulund

Hi,
I'm having some issues with the `annotate` rule. Any ideas on what might be wrong, or any pointers I could investigate?
```
(IMP_pipeline) [imp_test]$ impy -e "IMP_SUDO=sudo" --threads 20 -d /db/IMP -a megahit run -m samples/2809_v1_1.fastq.gz -m samples/2809_v1_2.fastq.gz -o 2809_v1 --single-omics
Executing IMP command:
docker run --net=host --rm --name home_fredrik_boulund_workspace_tmp_imp_test_2809_v1 -v /db/IMP:/home/imp/databases -v /home/fredrik.boulund/workspace_tmp/imp_test/2809_v1:/home/imp/output -v /home/fredrik.boulund/workspace_tmp/imp_test/samples:/home/imp/data -e "LOCAL_USER_ID=`id -u $USER`" -e "LOCAL_GROUP_ID=`id -g $USER`" -e IMP_SUDO="sudo" -e IMP_BINNING_METHOD="maxbin" -e MEMTOTAL="8" -e MEMCORE="2" -e THREADS="20" -e MG="/home/imp/data/2809_v1_1.fastq.gz /home/imp/data/2809_v1_2.fastq.gz" -e MT="" -e IMP_ASSEMBLER="megahit" -e IMP_STEPS="preprocessing assembly analysis binning report" docker-r3lab.uni.lu/imp/imp:1.4.1 snakemake -s /home/imp/code/Snakefile
groupmod: GID '100' already exists
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 ALL
1 ANALYSIS
1 ASSEMBLY
1 BINNING
1 PREPROCESSING
1 REPORT
1 annotate
1 assembly_contig_length
1 call_gene_depth
1 diagramm
1 download_librairies_files
1 fastqc_preprocessed
1 fastqc_raw
1 krona
1 local_librarie_files
1 make_report
1 maxbin
1 maxbin_contig2bin
1 metaquast
1 reads_count
1 rename_stat_output
1 variant_calling
1 visualize
1 visualize_maxbin
1 vizbin
25
rule annotate:
input: Assembly/mg.assembly.merged.fa, /home/imp/databases/cm/Bacteria.i1i, /home/imp/databases/genus/Staphylococcus.phr, /home/imp/databases/hmm/CLUSTERS.hmm.h3f, /home/imp/databases/kingdom/Archaea/sprot.phr, Analysis/tbl2asn.updated
output: Analysis/annotation/annotation.filt.gff, Analysis/annotation/prokka.faa, Analysis/annotation/prokka.fna, Analysis/annotation/prokka.ffn, Analysis/annotation/prokka.fsa
Error in job annotate while creating output files Analysis/annotation/annotation.filt.gff, Analysis/annotation/prokka.faa, Analysis/annotation/prokka.fna, Analysis/annotation/prokka.ffn, Analysis/annotation/prokka.fsa.
RuleException:
CalledProcessError in line 13 of /home/imp/code/rules/Analysis/prokka.rule:
Command '
### prokka by default will look databases where is located the binary.
### we have to softlink to put the binary somewhere and the databases somewhere else.
if [[ "/home/imp/databases" = /* ]]
then
PP=/home/imp/databases;
else
PP=$PWD//home/imp/databases;
fi
DD=$(dirname $(which prokka))/../db
if [[ ! -L $DD ]]
then
CUR=$PWD
echo "Softlinking $DD to $PP"
cd $(dirname $(which prokka))/.. && ln -fs $PP db
cd $CUR
fi
rm -rf Analysis/annotation/
prokka --force --outdir Analysis/annotation --prefix prokka --cpus 20 --metagenome Assembly/mg.assembly.merged.fa >> Analysis/annotation.log 2>&1
# Prokka gives a gff file with a long header and with all the contigs at the bottom. The command below removes the
# And keeps only the gff table.
LN=`grep -Hn "^>" Analysis/annotation/prokka.gff | head -n1 | cut -f2 -d ":" || if [[ $? -eq 141 ]]; then true; else exit $?; fi`
LN1=1
LN=$(($LN-$LN1))
head -n $LN Analysis/annotation/prokka.gff | grep -v "^#" | sort | uniq | grep -v "^==" > Analysis/annotation/annotation.filt.gff
' returned non-zero exit status 2
File "/usr/lib/python3.4/concurrent/futures/thread.py", line 54, in run
Removing output files of failed job annotate since they might be corrupted:
Analysis/annotation/prokka.faa, Analysis/annotation/prokka.fna, Analysis/annotation/prokka.ffn, Analysis/annotation/prokka.fsa
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
```

https://git-r3lab.uni.lu/IMP/IMP/-/issues/125
Error when rebuilding docker container
2018-02-12T16:46:38+01:00
easternbluebird

I think I might have found a fix for #123.
```
#### Add IMP user
RUN groupadd imp && useradd -g imp -d /home/imp imp \
&& chown imp:imp -R /home/imp/ \
&& chmod -R 0777 /home/imp \
&& echo 'imp:imp' |chpasswd \
&& echo "imp ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/imp \
&& chmod 0440 /etc/sudoers.d/imp # <-- I added this line here
```
The change is the last line of this code chunk.
However, I could not test it because I could not rebuild the Docker container following the instructions in `BUILD_TARBALL.md`.
This step fails:
```
$ docker build -t docker-r3lab.uni.lu/imp/imp:1.4.1.1 .
Sending build context to Docker daemon 43.01kB
Step 1/18 : FROM docker-r3lab.uni.lu/imp/imp-tools:1.4.1.1
Get https://docker-r3lab.uni.lu/v2/imp/imp-tools/manifests/1.4.1.1: no basic auth credentials
```
How can I get access to that?