Environmental Cheminformatics / pubchem / Merge requests / !9
The source project of this merge request has been removed.

Generate output based on PubChemLite/MetFrag legend

Merged. Todor Kondic requested to merge (removed):remove-tiers into master, Nov 03, 2020

Major Changes

Input directory and files

The pb-lite-driver.sh script reads its configuration from an input directory. That directory contains a YAML manifest file and a mapping file that relates PubChem index bits to PubChem category names and to MetFrag column names. For example:

manifest.yaml

# The file containing index bits of pubchem, the pubchem category
# names and the corresponding metfrag column names.
map: PubChemLite_exposomics.map

# The top-level build directory [will be created if it does not exist].
topdir: tiers

# Where to store the entire build (for backup and forensics) [must
# exist].
outdir: jimbo

# Directory or directories in which to store output MetFrag files [must
# exist].
mf_dirs:
  - mf_dir1
  - mf_dir2
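
The bracketed notes in the manifest comments describe two different rules: topdir is created when missing, while outdir and every entry of mf_dirs must already exist. A minimal sketch of how such rules could be enforced in shell follows. The function names ensure_dir and require_dir are hypothetical and are not taken from pb-lite-driver.sh.

```shell
# Hypothetical helpers, not part of the actual driver script.

# Create the directory if it does not exist (the "topdir" rule).
ensure_dir() {
    mkdir -p -- "$1"
}

# Return non-zero with a message on stderr unless the directory
# already exists (the "outdir" and "mf_dirs" rule).
require_dir() {
    if [ ! -d "$1" ]; then
        printf 'Fatal: directory "%s" does not exist\n' "$1" >&2
        return 1
    fi
}

# Example use with the values from the manifest above:
#   ensure_dir tiers
#   require_dir jimbo
#   for d in mf_dir1 mf_dir2; do require_dir "$d"; done
```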

Mapping file (e.g. PubChemLite_exposomics.map)

BIT,CATEGORY,MFCOL
192,Agrochemical Information,AgroChemInfo
426,Biomolecular Interactions and Pathways,BioPathway
82,Drug and Medication Information,DrugMedicInfo
204,Food Additives and Ingredients,FoodRelated
344,Pharmacology and Biochemistry,PharmacoInfo
356,Safety and Hazards,SafetyInfo
396,Toxicity,ToxicityInfo
350,Use and Manufacturing,KnownUse
137,Associated Disorders and Diseases,DisorderDisease
171,Identification,Identification
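
To illustrate how the MFCOL column relates to the tab-separated header that appears in the log, here is a small sketch that extracts that column from a mapping file. The function name mf_header is hypothetical, and the sketch assumes a single BIT,CATEGORY,MFCOL header row and no commas inside fields. It is not the code the pipeline actually uses to generate its legend.

```shell
# Hypothetical helper: print the third column (MFCOL) of a mapping
# file as one tab-separated line, skipping the header row.
mf_header() {
    awk -F',' '
        NR > 1 && NF >= 3 { cols = (cols == "" ? $3 : cols "\t" $3) }
        END { print cols }
    ' "$1"
}

# Example use:
#   mf_header PubChemLite_exposomics.map
```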

Logging

The previous version already logged everything into a single file. Now the log looks nicer (thanks to Log Lady ;-) ) and is easier to search.

A snippet of the log:

********** PubChem Lite LOG START Wed Nov  4 09:23:18 CET 2020 **********
LOG LADY> (gen_filtfile) Generating filter file: .../PubChemLite_exposomics.filter
LOG LADY> (gen_legend) Generating legend file: .../PubChemLite_exposomics.legend
LOG LADY> [* START *] STAMP
LOG LADY> Current directory is ...
LOG LADY> Input files are located in inputs
LOG LADY> The top-level directory used by PCL is ...
LOG LADY> The build directory is ...
LOG LADY> The filter file used is .../PubChemLite_exposomics.filter
LOG LADY> The output MetFrag file will be PubChemLite_exposomics_20201104.csv
LOG LADY> The scripts have been located in ...
LOG LADY> The full build result is going to be stored in ...
LOG LADY> The MetFrag files are going to be written to the following dirs: mf_dir1 mf_dir2
LOG LADY> Legend file is .../PubChemLite_exposomics.legend
LOG LADY> [* END *] STAMP
LOG LADY> (adapt_scripts): Adapting paths in Perl scripts.
LOG LADY> (adapt_scripts): Adapted analyze_toc_info.pl
LOG LADY> (adapt_scripts): Adapted filter_toc_info.pl
LOG LADY> (adapt_scripts): Adapted fp_merge.pl
LOG LADY> (adapt_scripts): Adapted mapping.pl
LOG LADY> (adapt_scripts): Adapted pull_cid_content.pl
LOG LADY> (adapt_scripts): Adapted read_manifest.pl
LOG LADY> (adapt_scripts): Adapted remove_unwanted_cases.pl
LOG LADY> (adapt_scripts): Adapted rest_grab_props.pl
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> (sanity_prebuild): Warning. No previous build dir available for comparison.
LOG LADY> [* START *] BUILD
LOG LADY> legendfile: .../PubChemLite_exposomics.legend
LOG LADY> [* START *] BUILD rest_grab_props ( Wed Nov 4 09:23:18 CET 2020 )
:: Using legend file: ".../PubChemLite_exposomics.legend" ::
:: Header is AgroChemInfo	BioPathway	DrugMedicInfo	FoodRelated	PharmacoInfo	SafetyInfo	ToxicityInfo	KnownUse	DisorderDisease	Identification

LOG LADY> [* END *] BUILD rest_grab_props ( Wed Nov 4 09:23:18 CET 2020 )
LOG LADY> [* END *] BUILD
********** PubChem Lite LOG END Wed Nov  4 09:23:18 CET 2020 **********
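
For readers unfamiliar with this style of log, the following sketch shows one way a fixed prefix and START/END markers like the ones above could be produced in shell. It is an illustration only: the names log_lady, log_section and LOGFILE are assumptions, not the identifiers used in the merged scripts.

```shell
# Hypothetical log destination; the real script's variable is unknown.
LOGFILE="${LOGFILE:-/dev/stderr}"

# Append one line with the fixed prefix to the log.
log_lady() {
    printf 'LOG LADY> %s\n' "$*" >> "$LOGFILE"
}

# Run a command between START and END markers, send its output to the
# log, and hand its exit status back to the caller.
log_section() {
    local name="$1"
    shift
    local status=0
    log_lady "[* START *] $name"
    "$@" >> "$LOGFILE" 2>&1 || status=$?
    log_lady "[* END *] $name"
    return "$status"
}
```

With a fixed prefix, a search such as `grep '^LOG LADY> \[\* START \*\]' "$LOGFILE"` lists every section that was started.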

Exit on fatal errors

This is finally properly implemented. Even subshells behave as they are supposed to.
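
As background on why subshells need care: an `exit` inside a subshell ends only that subshell, and Bash command substitutions do not inherit `set -e` unless `inherit_errexit` (Bash 4.4 or later) or POSIX mode is enabled. The sketch below shows one common pattern for propagating a fatal error out of a subshell. The names fatal and in_subshell are hypothetical, and this is not necessarily how the merge request implements it.

```shell
# Print a message to stderr and exit with a non-zero status. When called
# inside a subshell this ends only the subshell, so the caller has to
# look at the returned status.
fatal() {
    printf 'Fatal: %s\n' "$*" >&2
    exit 1
}

# Run a command in a subshell and return its exit status to the caller.
in_subshell() {
    ( "$@" )
}

# Example use: stop the whole script when a step fails in its subshell.
#   in_subshell some_step || fatal "some_step failed"
```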

Wink, wink @emma.schymanski .

Edited Nov 05, 2020 by Todor Kondic
Source branch: remove-tiers