IMP / IMP3 / Issues / #51

Closed
Open
Created Sep 30, 2021 by Valentina Galata (@valentina.galata, Maintainer)

Preprocessing: kneaddata for reads filtering

Feature request

I propose that we consider using kneaddata for read filtering.

From the kneaddata description: "This tool aims to perform principled in silico separation of bacterial reads from these 'contaminant' reads, be they from the host, from bacterial 16S sequences, or other user-defined sources."

  • can be installed via conda
  • can use multiple references for filtering
  • outputs reads mapped to each given reference in separate FASTQ files
  • (runs fastqc for the input/output FASTQ files)

The rRNA filtering step could be included there as well, or it could remain a separate rule. With or without the rRNA filtering, this would reduce the code complexity considerably: there would be no need for the "chained" FASTQ files with multiple filtering suffixes in their names.

The trimming step built into kneaddata can be skipped, and it has to be, because the optional poly-G trimming must be done before filtering.
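A minimal sketch of what such a call could look like, written as a Python helper that only assembles the command line. The helper name `build_kneaddata_cmd` and all file and database paths are hypothetical. The flags (`--input1`/`--input2`, `--reference-db`, `--bypass-trim`, `--output`, `--threads`) are taken from the kneaddata documentation as I recall it and should be checked against `kneaddata --help` for the installed version; older releases took paired reads as two repeated `--input` arguments instead.

```python
import shlex
from typing import List, Sequence


def build_kneaddata_cmd(
    r1: str,
    r2: str,
    reference_dbs: Sequence[str],
    out_dir: str,
    threads: int = 1,
) -> List[str]:
    """Assemble a kneaddata call for paired-end reads (hypothetical helper).

    Assumes the reads were already trimmed upstream (including the optional
    poly-G trimming), so kneaddata's own trimming step is bypassed.
    """
    cmd = [
        "kneaddata",
        "--input1", r1,          # assumption: newer releases; older ones use --input twice
        "--input2", r2,
        "--output", out_dir,
        "--threads", str(threads),
        "--bypass-trim",         # skip the built-in trimming step
    ]
    # One --reference-db per filtering reference (e.g. host, rRNA).
    for db in reference_dbs:
        cmd += ["--reference-db", db]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_kneaddata_cmd(
        "sample_R1.trimmed.fastq.gz",
        "sample_R2.trimmed.fastq.gz",
        ["dbs/host", "dbs/rrna"],
        "preprocessing/filtered",
        threads=4,
    )
    print(shlex.join(cmd))
```

In a Snakemake rule, the returned list could be passed to `subprocess.run(cmd, check=True)` or rendered into a `shell:` directive. Either way, a single rule would replace the chain of per-reference filtering steps.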

kneaddata:

  • web site
  • repo
  • tutorials
  • forum