Commit 19640026 authored by Valentina Galata

created a test data set: FNA, FAA, GFF from IMP3 test data, readme and set-up script, test config (issue #30)
parent e709e686
Input files
```bash
# *.fna, *.faa and *.gff
rsync -avP /work/projects/ecosystem_biology/local_tools/IMP3/test/testRAW/run150320/Analysis/annotation/prokka.gff test_sample.gff
rsync -avP /work/projects/ecosystem_biology/local_tools/IMP3/test/testRAW/run150320/Analysis/annotation/prokka.fna test_sample.fna
rsync -avP /work/projects/ecosystem_biology/local_tools/IMP3/test/testRAW/run150320/Analysis/annotation/prokka.faa test_sample.faa
# set-up: modify files, create other required files
./set-up.sh
```
#!/bin/bash -l
# modify *.faa: rm record descriptions
sed -i '/^>/ s/ .*//' test_sample.faa
# *.gff to *.contig: contig ID, feature ID
grep -v '^#' test_sample.gff | cut -f1,9 | cut -d';' -f1 | sed 's/ID=//' > test_sample.contig
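To make the two set-up steps concrete, here is a minimal sketch that runs the same `sed` and `grep | cut | sed` commands on toy inputs. The file names, gene ID and contig ID are made up for illustration. One deliberate deviation from the script: `cut -s` is added as an assumption, so that lines without tabs (such as the sequences in a trailing `##FASTA` section, which Prokka GFF output commonly carries) are skipped instead of being copied into the `*.contig` file.

```bash
#!/bin/bash
# Toy demonstration of the set-up steps; all file contents below are made up.
set -euo pipefail

# Strip FASTA record descriptions: keep only the ID after '>'.
strip_descriptions() {
    sed '/^>/ s/ .*//' "$1"
}

# Print "contig ID <TAB> feature ID" for every GFF feature line.
# ASSUMPTION: -s is not in the original script; it drops lines without
# tabs, e.g. sequence lines of an embedded ##FASTA section.
gff_to_contig() {
    grep -v '^#' "$1" | cut -s -f1,9 | cut -d';' -f1 | sed 's/ID=//'
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical protein FASTA with a record description.
printf '>GENE_00001 hypothetical protein\nMKV\n' > "$tmp/toy.faa"

# Hypothetical GFF3 with one CDS feature and an embedded FASTA section.
printf '##gff-version 3\ncontig_1\tProdigal:2.6\tCDS\t1\t9\t.\t+\t0\tID=GENE_00001;product=hypothetical protein\n##FASTA\n>contig_1\nATGAAAGTT\n' > "$tmp/toy.gff"

faa_out=$(strip_descriptions "$tmp/toy.faa")
contig_out=$(gff_to_contig "$tmp/toy.gff")

printf '%s\n' "$faa_out" "$contig_out"
```

The header line loses its description, and the single feature line becomes a contig ID and a feature ID separated by a tab. Without `-s`, `cut` prints lines that contain no tab unchanged, so any FASTA lines left after the `grep` would pass through.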
# List of sample names, i.e. base names of input files
# Each sample should have 3 files: *.fna, *.faa, *.contig
input_file: ["test_sample"]
# Unique project name (used in output directory name)
project: "output_test"
# Data path
# Output will be saved in <OUTDIR>/<project>
OUTDIR: "test"
# Split size of FASTA files (default: 10 000 seqs/file)
size_fasta: 100000
# Workflow (default: "complete")
# complete: complete pipeline: toxin + virulence + (AMR + MGE) prediction
# Tox: toxin prediction
# Vir: virulence prediction
# AMR: antimicrobial resistance (AMR) & mobile genetic element (MGE) prediction
workflow: "complete"
###########
# SignalP #
###########
# SignalP
signalp: "/mnt/irisgpfs/projects/ecosystem_biology/local_tools/SignalP/signalp-4.1/signalp"
#########
# Toxin #
#########
# HMM
hmmscan_tool: "hmmsearch"
hmm_file: "databases/toxins/combined_Toxin.hmm"
#############
# Virulence #
#############
# HMM
vir_hmm_file: "databases/virulence/Virulence_factor.hmm"
#######
# AMR #
#######
# DeepARG
deep_ARG: "submodules/deeparg-ss/deepARG.py"
# Plasflow
Plasflow: "PlasFlow.py"
# Virsorter
virsorter: "wrapper_phage_contigs_sorter_iPlant.pl"
virsorter_data: "scripts/virsorter-data"
# DeepVirFinder
DeepVirFinder: "submodules/DeepVirFinder/dvf.py"
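The config comments state that every sample listed in `input_file` needs three files: `*.fna`, `*.faa` and `*.contig`. A quick pre-flight check along those lines might look like the sketch below. The helper `missing_inputs` is hypothetical and not part of the repository, and the directory holding the input files is an assumption passed in as an argument.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): list the required
# input files that are absent or empty for one sample.
missing_inputs() {
    local dir=$1 sample=$2 ext
    for ext in fna faa contig; do
        [ -s "$dir/$sample.$ext" ] || printf '%s\n' "$sample.$ext"
    done
}

# Demo in a scratch directory with made-up contents:
# only two of the three required files exist.
demo=$(mktemp -d)
printf '>c1\nACGT\n' > "$demo/test_sample.fna"
printf '>g1\nMK\n' > "$demo/test_sample.faa"

missing=$(missing_inputs "$demo" test_sample)
printf 'missing: %s\n' "$missing"

rm -rf "$demo"
```

In the test data directory this could be called as `missing_inputs . test_sample` after running `./set-up.sh`; empty output means all three files are present and non-empty.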