# About

Results archive created for manuscript submission.

# Archived files

Abbreviations and names:

- sample names: Zymo, NWC, GDB, Rumen
- SR: short-read (data)
- LR: long-read (data)
- Hy: hybrid (SR+LR)
- metaG: metagenomic data
- metaT: metatranscriptomic data (GDB only)
- metaP: metaproteomic data (GDB only)

For each sample, the following files were saved (in `/results/`):

- preprocessing stats:
    - SR (metaG/metaT), FastP stats: `preproc/*/sr/fastp.*`
    - SR/LR (metaG/metaT, GDB), bbmap stats: `preproc/*/*/bbmap.*`
- reads QC:
    - SR (metaG/metaT), FastQC: `qc/*/sr/*_fastqc.zip`
    - LR, NanoStats: `qc/*/lr/NanoStats.*`
- assembly:
    - "raw" contigs: `assembly/*/*/ASSEMBLY.fasta`
    - polished contigs: `assembly/*/*/ASSEMBLY.POLISHED.fasta` (for LR and Hy only)
- assembly mapping:
    - mapping rate summary stats: `mapping/*/mappability.tsv`
    - average metaT coverage per gene (GDB): `mapping/metat/*/*/ASSEMBLY.POLISHED.sr.cov.pergene`
- annotation:
    - protein prediction w/ Prodigal
        - FAA FASTA file: `annotation/prodigal/*/*/proteins.faa`
        - summary (gene counts): `annotation/prodigal/summary.gene.counts.tsv`
        - summary (gene lengths): `annotation/prodigal/summary.gene.length.tsv`
    - AMR prediction w/ RGI (CARD)
        - RGI output: `annotation/rgi/*/*/rgi.txt`
        - summary: `annotation/rgi/summary.tsv`
    - rRNA gene prediction w/ barrnap
        - GFF file: `annotation/barrnap/*/*/*.gff`
        - summary: `annotation/barrnap/summary.tsv`
- analysis:
    - assembly quality stats w/ QUAST
        - QUAST report: `analysis/quast/*/*/report.tsv`
        - summary: `analysis/quast/summary_report.tsv`
    - analysis w/ Mash: `analysis/mash/contigs.*` (sketch and distances)
    - DIAMOND search in UniProt database
        - DIAMOND hits: `analysis/diamond/*.tsv`
        - summary: `analysis/diamond/summary.db.tsv`
    - protein clustering w/ mmseqs2
        - clusters: `analysis/mmseqs2/clusters.tsv`
        - summary: `analysis/mmseqs2/summary.tsv`
- extra data/analysis for GDB:
    - metaP reports: `metap/20210323/*_Default_Peptide_Report.txt`, `metap/20210323/*_Default_Protein_Report.txt`
    - AMR/metaT analysis: `report/amr.tsv`
    - "high-confidence" proteins: `report/mmseqs2_highconf.tsv`
    - metaT cov. of exclusive proteins: `report/mmseqs2_uniq.tsv`
    - metaP-based summary: `report/metap.txt`
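
The path patterns listed above are shell-style globs relative to the results directory, so the contents of an unpacked archive can be checked against them. The following is a minimal sketch and not part of the archive: the function name `count_matches`, the `PATTERNS` subset, and the default directory name `results` are assumptions made for illustration.

```python
from pathlib import Path
import sys

# Illustrative subset of the glob patterns listed above,
# relative to the results directory (extend as needed).
PATTERNS = [
    "preproc/*/sr/fastp.*",
    "qc/*/sr/*_fastqc.zip",
    "assembly/*/*/ASSEMBLY.fasta",
    "assembly/*/*/ASSEMBLY.POLISHED.fasta",
    "mapping/*/mappability.tsv",
    "annotation/prodigal/*/*/proteins.faa",
    "analysis/quast/*/*/report.tsv",
]


def count_matches(results_dir, patterns=PATTERNS):
    """Return a dict mapping each pattern to the number of matching paths."""
    root = Path(results_dir)
    return {pattern: len(list(root.glob(pattern))) for pattern in patterns}


if __name__ == "__main__":
    # Assumed default: the archive was unpacked into ./results
    results_dir = sys.argv[1] if len(sys.argv) > 1 else "results"
    for pattern, n in count_matches(results_dir).items():
        status = "ok" if n else "MISSING"
        print(f"{status}\t{n}\t{pattern}")
```

Run it as `python check_archive.py /path/to/results` (the script name is hypothetical). A pattern reported as `MISSING` is not necessarily an error: polished contigs, for example, exist for LR and Hy assemblies only, and metaT files exist for GDB only.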