Expired
Milestone Feb 28, 2022–Mar 31, 2022

PathoFact v2
Milestone ID: 407

Notes

To be updated

Requirements

  • new(er/est) version of snakemake
  • rgi: should now be able to handle * in FAA files
    • important: does removing * from FAA sequences affect any results, or should it be done in any case?
  • signalp, v6
    • paper
    • web site
    • repo
    • need to check compatibility
  • optional: plasflow --> should alternatives be added?
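The open question about `*` in FAA files can be explored with a small helper. This is a minimal sketch, not part of PathoFact: the function name `strip_stop_symbols` is hypothetical, and it takes the blunt approach of removing every `*` from sequence lines (terminal and internal alike) while leaving header lines untouched. Whether internal stop symbols should be treated differently is exactly the point still to be decided.

```python
def strip_stop_symbols(fasta_text: str) -> str:
    """Remove every '*' from sequence lines; header lines are kept as-is.

    Hypothetical helper for illustration only.
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep verbatim, even if it contains '*'.
            cleaned.append(line)
        else:
            # Sequence line: drop all stop symbols.
            cleaned.append(line.replace("*", ""))
    return "\n".join(cleaned) + "\n" if cleaned else ""


example = ">prot1 some description\nMKV*\n>prot2\nMA*LK\n"
print(strip_stop_symbols(example), end="")
```

Running a tool on both the original and the cleaned file and comparing the outputs would be one way to answer whether the removal affects results.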

Configuration

Config

  • no runtime configuration (only in rules and profiles)
    • rm runtime, mem
  • sample table (see also below)
    • ID, FNA, (FAA, FNA/FAA mapping)
  • no project and datadir
    • either only one output path or use work folder via profiles
  • paths to DBs: single path? group by attribute?
  • allow multiple steps (see also below)

Sample table

  • columns: ID, FNA, (FAA, FNA/FAA mapping)
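A sample table with these columns could be read and checked roughly as follows. This is a sketch under assumptions: the table is taken to be tab-separated, `ID` and `FNA` are treated as required, and the optional column names `FAA` and `MAPPING` are placeholders chosen for illustration (the note does not fix them).

```python
import csv
import io

REQUIRED_COLUMNS = ("ID", "FNA")
OPTIONAL_COLUMNS = ("FAA", "MAPPING")  # hypothetical names for the optional columns


def read_sample_table(handle):
    """Parse a tab-separated sample table into {sample_id: {column: path or None}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"sample table is missing required columns: {missing}")

    samples = {}
    for row in reader:
        sample_id = row["ID"]
        if not sample_id or not row["FNA"]:
            raise ValueError(f"row with empty ID or FNA: {row}")
        if sample_id in samples:
            raise ValueError(f"duplicate sample ID: {sample_id}")
        entry = {"FNA": row["FNA"]}
        for col in OPTIONAL_COLUMNS:
            # Missing column or empty cell both become None.
            entry[col] = row.get(col) or None
        samples[sample_id] = entry
    return samples


table = "ID\tFNA\tFAA\nS1\ts1.fna\ts1.faa\nS2\ts2.fna\t\n"
samples = read_sample_table(io.StringIO(table))
print(samples["S2"])
```

In the workflow itself, a schema-based check would likely replace the manual checks shown here; the sketch only makes the intended table shape concrete.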

Profiles

  • different types:
    • generic (w/o a scheduler)
    • HPC w/ slurm (simple setup)
  • working directory (???)

Workflow

  • standardized structure
  • standardized rule structure
    • benchmark and log files
  • use tmp, shadow rules for temp output
  • conda env. YAML files: update/clean
  • config validation
    • schemas for config and sample table
  • save config to output
  • set working directory
  • replace checkpoints with split/gather
    • consider whether splitting should be kept for all steps or not
    • seqkit: split2
    • todo: need to know the total number of sequences
    • todo: is there a signalp limit on the number of sequences?
  • allow different step combinations
  • output sub-folder per sample
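For the split/gather item, the number of parts has to be known before the workflow's job graph is built, which is why the total sequence count matters. A minimal sketch of that calculation, assuming a fixed maximum number of sequences per part (the function names and the per-part limit are illustrative, not taken from PathoFact):

```python
import math


def count_fasta_records(lines):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def number_of_parts(total_records, max_per_part):
    """Return how many parts are needed so that no part exceeds max_per_part.

    Always returns at least 1, so an empty input still yields one (empty) part.
    """
    if max_per_part < 1:
        raise ValueError("max_per_part must be a positive integer")
    return max(1, math.ceil(total_records / max_per_part))


fasta_lines = [">a", "MKV", ">b", "MAL", ">c", "MTT"]
total = count_fasta_records(fasta_lines)
print(total, number_of_parts(total, max_per_part=2))
```

The resulting part count could then be passed to a splitter that divides a file into a fixed number of parts, such as `seqkit split2` in its by-part mode; the exact invocation would need checking against the installed seqkit version.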
# example for a step combination
workflows:
  - vir: true
  - amr: true
  - tox: false
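The step configuration above, combined with one output sub-folder per sample, could be turned into a list of targets roughly like this. It is a sketch only: the config is written as the Python structure that the YAML would load into, and the `results/<sample>/<step>` layout and both function names are assumptions made for illustration.

```python
# The YAML example, as the Python structure it would load into.
config = {"workflows": [{"vir": True}, {"amr": True}, {"tox": False}]}


def enabled_steps(config):
    """Return the names of all steps switched on, in config order."""
    steps = []
    for entry in config.get("workflows", []):
        for name, enabled in entry.items():
            if enabled:
                steps.append(name)
    return steps


def target_dirs(config, sample_ids, outdir="results"):
    """Build one output directory per sample and enabled step (hypothetical layout)."""
    steps = enabled_steps(config)
    return [f"{outdir}/{sample_id}/{step}" for sample_id in sample_ids for step in steps]


print(enabled_steps(config))  # ['vir', 'amr']
print(target_dirs(config, ["S1", "S2"]))
```

A plain mapping (`workflows: {vir: true, amr: true, tox: false}`) would carry the same information and is simpler to validate with a schema; the list form is kept here only because it matches the example.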

Rules

  • rule resources

Testing

  • Use snakemake's unit test utility
  • CI

GitLab

  • templates for issues
Reference: laura.denies/PathoFact%"PathoFact v2"