Commit 7e804b12 authored by Emma Schymanski

added PUG REST section

... plus MetFrag details - and other minor tweaks
parent 5177c750
pfas-tree/.Rhistory
---
title: "PFAS and Fluorinated Organic Compounds in PubChem Tree"
author:
- "Emma L. Schymanski^1^, Parviel Chirsir^1^,"
- "Emma L. Schymanski^1^, Parviel Chirsir^1^, Todor Kondic^1^"
- "Paul A. Thiessen^2^, Jian Zhang^2^ and Evan E. Bolton^2^"
date: "24/03/2022"
output: pdf_document
......@@ -12,13 +12,15 @@ urlcolor: blue
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
#knitr::opts_chunk$set(fig.pos = "!H", out.extra = "")
```
^1^ Luxembourg Centre for Systems Biomedicine (LCSB),
University of Luxembourg, 6 avenue du Swing, 4367, Belvaux, Luxembourg.
ELS: ORCID [0000-0001-6868-8145](http://orcid.org/0000-0001-6868-8145)
PC: ORCID [0000-0002-9932-8609](http://orcid.org/0000-0002-9932-8609)
ELS: ORCID [0000-0001-6868-8145](http://orcid.org/0000-0001-6868-8145),
PC: ORCID [0000-0002-9932-8609](http://orcid.org/0000-0002-9932-8609),
TK: ORCID [0000-0001-6662-4375](https://orcid.org/0000-0001-6662-4375)
^2^ National Center for Biotechnology Information, National
Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
......@@ -55,12 +57,13 @@ This document is organised into several sections, as follows:
|_OECD PFAS Definition_ | [Go to heading](#oecddef) | 2 |
|_Organofluorine Compounds_ | [Go to heading](#orgf) | 5 |
|_PFAS and Fluorinated Organic Compound Collections_ | [Go to heading](#lists) | 5 |
|Navigating the Tree | [Go to heading](#search) | 5 |
|_Search via PubChem Search_ | [Go to heading](#pc-search) | 5 |
|Navigating the Tree | [Go to heading](#search) | 6 |
|_Search via PubChem Search_ | [Go to heading](#pc-search) | 6 |
|_Interactions via Entrez_ | [Go to heading](#entrez) | 6 |
|Implementation | [Go to heading](#impl) | 6 |
|Statements | [Go to heading](#statements) | 6 |
|References | [Go to heading](#statements) | 6 |
|_Interactions via PUG REST_ | [Go to heading](#pugrest) | 6 |
|Implementation | [Go to heading](#impl) | 7 |
|Statements and References | [Go to heading](#statements) | 8 |
<!-- |References | [Go to heading](#statements) | 6 | -->
To become more familiar with the PubChem Classification Browser features
......@@ -199,10 +202,14 @@ the load time for large parts of the tree. It is possible to use some
advanced search and querying capabilities to improve the interaction
with the tree, see [Navigating the Tree](#search) below.
The _PFAS Parts Larger than CF~2~/CF~3~_ will soon be available as
a [MetFrag](https://msbi.ipb-halle.de/MetFrag/) file for further interactions.
The _PFAS Parts Larger than CF~2~/CF~3~_ is available as
a [MetFrag](https://msbi.ipb-halle.de/MetFrag/) file for further use
[@pubchem_pfas_metfrag_2022]. The CSV can be downloaded from Zenodo
(DOI:[10.5281/zenodo.6385954](https://doi.org/10.5281/zenodo.6385954))
for use in MetFragCL; it should be available from the MetFragWeb
drop-down menu soon. See the description on the Zenodo record for
more details.
#### TODO: add to Zenodo
### Organofluorine Compounds {#orgf}
......@@ -249,7 +256,74 @@ Content still to come ...
### Interactions via PUG REST {#pugrest}
Content still to come ...
It is also possible to interact with the PubChem PFAS Tree
programmatically. For more details on PUG REST and other
forms of programmatic access than are given below,
please see the PubChem documentation:
- https://pubchemdocs.ncbi.nlm.nih.gov/programmatic-access
- https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest
- https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest-tutorial
- https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest$classification_nodes
The following contains a few tips to start interacting with the tree in R;
note that some of these features are also in active development.
To start, load the required packages, download the latest scripts and source them:
```{r initialize}
library(rjson)
library(httr)
source("getPcHidTree.R")
```
Then, retrieve the information behind the PFAS tree. The hid number
is taken from the URL of the tree, _i.e._ hid=120. The following code
retrieves the details and structure to a depth of 3 (deeper subnodes
can be retrieved by increasing this number).
```{r get PFAS tree}
tree_csv <- getPcHidTree(120,3)
tree_csv
```
Then, load the file and take a look:
```{r read PFAS tree}
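PFAS_tree <- read.csv(tree_csv, stringsAsFactors = FALSE)
# first five nodes; columns 6, 7 and 10 are nodeNames, nodeHNID and node_nCIDs in the CSV header
PFAS_tree[1:5,c(6,7,10)]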
```
It is possible to subset by keyword to retain only the entries of interest,
for example to find the OntoChem lists among the PFAS collections:
```{r find OntoChem lists}
i_OntoChem_lists <- grep("OntoChem",PFAS_tree$nodeNames)
OntoChem_PFAS_lists <- PFAS_tree[i_OntoChem_lists,]
OntoChem_PFAS_lists[,c(6,7,10)]
```
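Keyword matching is not the only way to subset the tree: the `nodeIDs` and
`parentIDs` columns can also be used to select the direct children of a node.
The following is a minimal sketch on a small hand-built data frame standing in
for the real `PFAS_tree`; `PFAS_tree_demo` and `get_child_nodes()` are
hypothetical names introduced here for illustration and are not part of
`getPcHidTree.R`.

```r
# Hypothetical stand-in for PFAS_tree, using four rows from the tree output
PFAS_tree_demo <- data.frame(
  nodeNames = c("OntoChem PFAS Lists",
                "OntoChem PFAS from CORE - Definition A",
                "OntoChem PFAS from CORE - Definition B",
                "NORMAN-SLE PFAS Suspect Lists"),
  nodeIDs   = c("node_8924", "node_8926", "node_8927", "node_8928"),
  parentIDs = c("node_8923", "node_8924", "node_8924", "node_8923"),
  stringsAsFactors = FALSE
)

# Return the rows whose parent is the node with the given (exact) name
get_child_nodes <- function(tree, parent_name) {
  parent_id <- tree$nodeIDs[tree$nodeNames == parent_name]
  tree[tree$parentIDs %in% parent_id, ]
}

children <- get_child_nodes(PFAS_tree_demo, "OntoChem PFAS Lists")
children$nodeNames
```

Matching on the exact node name avoids picking up the parent node itself,
which a keyword search for "OntoChem" would also return.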
Once the PFAS Tree output is available, including node HNIDs, it is
possible to build the URL needed to retrieve the CID listings
per node entry via
[PUG REST](https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest$classification_nodes).
```{r add URL}
# https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/<integer>/<id type>/<format>
hnid_base_url <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/"
hnid_end_url <- "/cids/TXT"
# paste0() is vectorised, so a single call builds the URL for every node
PFAS_tree$REST_URL <- paste0(hnid_base_url, PFAS_tree$nodeHNID, hnid_end_url)
```
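Each URL built this way can then be requested to obtain the CIDs of the
corresponding node. The sketch below assumes that the `/cids/TXT` response is
plain text with one CID per line; `parse_cid_txt()` and `fetch_node_cids()`
are hypothetical helper names, and large nodes may need additional handling
(for example time-outs or retries) that is not shown here.

```r
# Turn the plain-text response body (assumed: one CID per line) into integers
parse_cid_txt <- function(txt) {
  lines <- trimws(unlist(strsplit(txt, "\r?\n")))
  as.integer(lines[nzchar(lines)])
}

# Request one REST URL and return the CIDs of that node
fetch_node_cids <- function(rest_url) {
  resp <- httr::GET(rest_url)
  httr::stop_for_status(resp)  # stop with an error on HTTP failure
  parse_cid_txt(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Offline check of the parser on a made-up response body
example_cids <- parse_cid_txt("1\n22\n\n333\n")
```

For example, `fetch_node_cids(PFAS_tree$REST_URL[1])` would request the CIDs
of the first node; nodes with very large CID counts may be slow to download.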
Finally, write the output for further use:
```{r output list}
write.csv(PFAS_tree, file="PubChem_PFAS_Tree_Details.csv",row.names = FALSE)
```
## Implementation {#impl}
......@@ -258,7 +332,6 @@ Content still to come ...
- test set
- notes on implementation
- MetFrag files
......
"hid","SourceName","SourceID","HNID","nCIDs","nodeNames","nodeHNID","nodeIDs","parentIDs","node_nCIDs","REST_URL"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OECD PFAS definition",5517102,"node_1","root",6096212,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517102/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains isolated CF2",5521752,"node_2","node_1",601930,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521752/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains CF2 and larger PFAS parts",5516635,"node_3","node_2",12284,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516635/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2",5519454,"node_1062","node_2",518285,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519454/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2/CF3",5524746,"node_1076","node_2",71361,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524746/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains isolated CF3",5523183,"node_1147","node_1",5414868,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523183/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains CF3 and larger PFAS parts",5518208,"node_1219","node_1147",25025,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518208/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2/CF3",5517629,"node_1148","node_1147",71361,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517629/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF3",5520213,"node_2389","node_1147",5318482,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520213/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains PFAS parts larger than CF2/CF3",5525061,"node_2420","node_1",188084,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5525061/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Breakdown by isolated PFAS part count",5520278,"node_2421","node_2420",188084,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520278/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Breakdown by isolated PFAS part type",5524707,"node_5660","node_2420",188084,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524707/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Organofluorine compounds",5523075,"node_8982","root",19080012,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523075/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aliphatic substances",5524117,"node_9398","node_8982",820900,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524117/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aliphatic substances that have a fully fluorinated methyl or methylene carbon atom",5520826,"node_9486","node_9398",536283,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520826/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other fluorinated aliphatic substances that do NOT have a fully fluorinated methyl or methylene carbon atom",5520510,"node_9399","node_9398",284617,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520510/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic substances",5521756,"node_8983","node_8982",18258541,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521756/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"(Non-)Fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that do NOT have a fully fluorinated methyl or methylene carbon atom",5524683,"node_9311","node_8983",1441556,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524683/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that have a fully fluorinated methyl or methylene carbon atom",5520571,"node_9234","node_8983",818299,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520571/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic ring(s) with non-fluorinated aliphatic side chain(s)",5516437,"node_8984","node_8983",11311085,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516437/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic substances without a side chain",5519611,"node_9151","node_8983",34597,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519611/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Non-fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that have fully fluorinated methyl or methylene carbon atom",5519122,"node_9069","node_8983",4653004,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519122/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other fluorinated substances",5525297,"node_9571","node_8982",571,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5525297/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 1 Fluorine atom",5523988,"node_9587","node_9571",370,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523988/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 2 Fluorine atoms",5520669,"node_9578","node_9571",112,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520669/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 3 Fluorine atoms",5519498,"node_9575","node_9571",40,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519498/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 4 Fluorine atoms",5518047,"node_9572","node_9571",24,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518047/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 5 Fluorine atoms",5522125,"node_9584","node_9571",12,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522125/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 6 Fluorine atoms",5524803,"node_9593","node_9571",11,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524803/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 7 Fluorine atoms",5525169,"node_9597","node_9571",2,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5525169/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"PFAS and Fluorinated Organic Compound Collections",5518087,"node_8923","root",36235,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518087/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"CompTox Chemicals Dashboard PFAS Suspect Lists",5519025,"node_8937","node_8923",8498,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519025/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFAS75S1] PFAS|EPA: List of 75 Test Samples (Set 1)",5516407,"node_8938","node_8937",73,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516407/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFAS75S2] PFAS|EPA: List of 75 Test Samples (Set 2)",5520863,"node_8958","node_8937",75,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520863/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASCAT] PFAS|EPA Structure-based Categories",5516769,"node_8939","node_8937",81,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516769/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDW537] PFAS|EPA|WATER: Existing EPA DW Method 537.1",5521910,"node_8963","node_8937",18,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521910/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDW] PFAS|EPA: New EPA Method Drinking Water",5522971,"node_8967","node_8937",25,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522971/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDWTREAT] PFAS|EPA|WATER: Drinking Water Treatment Technology",5518051,"node_8945","node_8937",8,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518051/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINSOL] PFAS|EPA: Chemical Inventory Insoluble in DMSO",5521524,"node_8961","node_8937",42,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521524/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINV] PFAS|EPA: ToxCast Chemical Inventory",5517742,"node_8943","node_8937",427,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517742/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINVIVO] PFAS|EPA: In Vivo Studies Available",5517886,"node_8944","node_8937",22,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517886/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASLITSEARCH] PFAS|EPA: Literature Search Completed",5521727,"node_8962","node_8937",22,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521727/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASNONDW] PFAS|EPA: New EPA Method Non-Drinking Water",5524299,"node_8969","node_8937",23,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524299/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASRESEARCH] PFAS|EPA: EPA PFAS Research List",5524835,"node_8972","node_8937",164,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524835/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASRL] PFAS|EPA: Cross-Agency Research List",5518845,"node_8950","node_8937",192,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518845/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASTOX] PFAS|EPA: Toxicity Assessments",5516853,"node_8941","node_8937",8,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516853/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASVALDW] PFAS|EPA|WATER: PFAS with Validated EPA Drinking Water Methods",5524434,"node_8970","node_8937",30,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524434/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASDEV1] PFAS|EPA PFAS chemicals without explicit structures",5518472,"node_8948","node_8937",44,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518472/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASGLUEGE] PFAS|NORMAN: Overview of PFAS Uses from Gluege et al (2020)",5519721,"node_8954","node_8937",482,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519721/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASINVITRO] PFAS|EPA: List of chemicals tested in in vitro methods 2019-2020",5519243,"node_8953","node_8937",181,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519243/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASKEMI] PFAS: List from the Swedish Chemicals Agency (KEMI) Report",5523875,"node_8968","node_8937",1472,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523875/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASLCMSGCMS] PFAS: Collection of GC-MS and LC-MS standards: Food Contact Materials",5521221,"node_8960","node_8937",37,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521221/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASMASTER] PFAS Master List of PFAS Substances (Version 2)",5518495,"node_8949","node_8937",8116,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518495/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASNORDIC] PFAS: Nordic PFAS Report 2019",5522465,"node_8964","node_8937",202,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522465/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASNTREV19] PFAS: PFAS in Non-Target HRMS Studies (Liu et al 2019)",5522601,"node_8966","node_8937",126,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522601/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASOECD] PFAS: Listed in OECD Global Database",5518176,"node_8946","node_8937",3701,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518176/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASOECDNA] NORMAN: List of PFAS from the OECD Curated by Nikiforos Alygizakis",5519841,"node_8955","node_8937",3205,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519841/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASPACKAGING] PFAS|EPA PFAS Substances in Pesticide Packaging",5520210,"node_8956","node_8937",7,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520210/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCT] Navigation Panel to PFAS Structure Lists",5524542,"node_8971","node_8937",8078,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524542/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV1] PFAS|EPA: PFAS structures in DSSTox (update March 2018)",5520485,"node_8957","node_8937",4350,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520485/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV2] PFAS|EPA: PFAS structures in DSSTox (update November 2019)",5519088,"node_8952","node_8937",6624,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519088/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV3] PFAS|EPA: PFAS structures in DSSTox (update August 2020)",5516782,"node_8940","node_8937",8136,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516782/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTv4] PFAS|EPA: PFAS structures in DSSTox (update August 2021)",5518869,"node_8951","node_8937",8078,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518869/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTDB] WATER|PFAS: PFAS Chemicals contained in the EPA Drinking Water Treatability Database",5520989,"node_8959","node_8937",37,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520989/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTOXDB] PFAS: PFAS-Tox Database",5525323,"node_8973","node_8937",42,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5525323/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTRI] PFAS: PFAS to the Toxics Release Inventory (TRI) Program by the National Defense Authorization Act",5517389,"node_8942","node_8937",97,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517389/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTRIER] PFAS Community-Compiled List (Trier et al. 2015)",5522560,"node_8965","node_8937",588,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522560/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PRORISKPFAS] NORMAN|List of PFAS Compiled from NORMAN-SusDat",5518201,"node_8947","node_8937",3371,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5518201/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"NORMAN-SLE PFAS Suspect Lists",5517745,"node_8928","node_8923",5884,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517745/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S09 | PFASTRIER | PFAS Suspect List of fluorinated substances from X. Trier and colleagues",5523688,"node_8934","node_8928",468,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523688/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S14 | KEMIPFAS | PFAS Highly Fluorinated Substances List from KEMI",5524111,"node_8936","node_8928",1344,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524111/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S25 | OECDPFAS | List of PFAS from the OECD",5522807,"node_8932","node_8928",3692,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522807/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S46 | PFASNTREV19 | List of PFAS reported in Non-Target HRMS Studies from Liu et al 2019",5523394,"node_8933","node_8928",680,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523394/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S80 | PFASGLUEGE | Overview of PFAS Uses",5522532,"node_8931","node_8928",1250,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522532/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S89 | PRORISKPFAS | List of PFAS Compiled from NORMAN SusDat",5516725,"node_8929","node_8928",4240,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5516725/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S92 | FLUOROPHARMA | List of 340 ATC classified fluoro-pharmaceuticals",5520737,"node_8930","node_8928",290,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5520737/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S94 | FLUOROPEST | List of 423 FRAC/HRAC/IRAC classified fluoro-agrochemicals",5523938,"node_8935","node_8928",318,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5523938/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS Lists",5517067,"node_8924","node_8923",26805,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517067/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Definition A",5522450,"node_8926","node_8924",26805,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5522450/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Definition B",5524740,"node_8927","node_8924",4114,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524740/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Defintion C",5521474,"node_8925","node_8924",3432,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521474/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other Organic Fluorinated Chemical Content in PubChem",5524741,"node_8974","node_8923",1674,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5524741/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"MeSH: Fluorinated Hydrocarbons",5517545,"node_8975","node_8974",295,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5517545/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"CAMEO Chemicals: Fluorinated Organic Compounds",5521039,"node_8981","node_8974",120,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5521039/cids/TXT"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"ChEBI: Organofluorine Compound",5519872,"node_8980","node_8974",1372,"https://pubchem.ncbi.nlm.nih.gov/rest/pug/classification/hnid/5519872/cids/TXT"
"hid","SourceName","SourceID","HNID","nCIDs","nodeNames","nodeHNID","nodeIDs","parentIDs","node_nCIDs"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OECD PFAS definition","5517102","node_1","root","6096212"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains isolated CF2","5521752","node_2","node_1","601930"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains CF2 and larger PFAS parts","5516635","node_3","node_2","12284"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2","5519454","node_1062","node_2","518285"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2/CF3","5524746","node_1076","node_2","71361"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains isolated CF3","5523183","node_1147","node_1","5414868"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains CF3 and larger PFAS parts","5518208","node_1219","node_1147","25025"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF2/CF3","5517629","node_1148","node_1147","71361"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains only isolated CF3","5520213","node_2389","node_1147","5318482"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Molecule contains PFAS parts larger than CF2/CF3","5525061","node_2420","node_1","188084"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Breakdown by isolated PFAS part count","5520278","node_2421","node_2420","188084"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Breakdown by isolated PFAS part type","5524707","node_5660","node_2420","188084"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Organofluorine compounds","5523075","node_8982","root","19080012"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aliphatic substances","5524117","node_9398","node_8982","820900"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aliphatic substances that have a fully fluorinated methyl or methylene carbon atom","5520826","node_9486","node_9398","536283"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other fluorinated aliphatic substances that do NOT have a fully fluorinated methyl or methylene carbon atom","5520510","node_9399","node_9398","284617"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic substances","5521756","node_8983","node_8982","18258541"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"(Non-)Fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that do NOT have a fully fluorinated methyl or methylene carbon atom","5524683","node_9311","node_8983","1441556"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that have a fully fluorinated methyl or methylene carbon atom","5520571","node_9234","node_8983","818299"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic ring(s) with non-fluorinated aliphatic side chain(s)","5516437","node_8984","node_8983","11311085"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Fluorinated aromatic substances without a side chain","5519611","node_9151","node_8983","34597"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Non-fluorinated aromatic ring(s) with fluorinated aliphatic side chain(s) that have fully fluorinated methyl or methylene carbon atom","5519122","node_9069","node_8983","4653004"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other fluorinated substances","5525297","node_9571","node_8982","571"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 1 Fluorine atom","5523988","node_9587","node_9571","370"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 2 Fluorine atoms","5520669","node_9578","node_9571","112"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 3 Fluorine atoms","5519498","node_9575","node_9571","40"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 4 Fluorine atoms","5518047","node_9572","node_9571","24"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 5 Fluorine atoms","5522125","node_9584","node_9571","12"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 6 Fluorine atoms","5524803","node_9593","node_9571","11"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Contains 7 Fluorine atoms","5525169","node_9597","node_9571","2"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"PFAS and Fluorinated Organic Compound Collections","5518087","node_8923","root","36235"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"CompTox Chemicals Dashboard PFAS Suspect Lists","5519025","node_8937","node_8923","8498"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFAS75S1] PFAS|EPA: List of 75 Test Samples (Set 1)","5516407","node_8938","node_8937","73"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFAS75S2] PFAS|EPA: List of 75 Test Samples (Set 2)","5520863","node_8958","node_8937","75"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASCAT] PFAS|EPA Structure-based Categories","5516769","node_8939","node_8937","81"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDW537] PFAS|EPA|WATER: Existing EPA DW Method 537.1","5521910","node_8963","node_8937","18"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDW] PFAS|EPA: New EPA Method Drinking Water","5522971","node_8967","node_8937","25"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASDWTREAT] PFAS|EPA|WATER: Drinking Water Treatment Technology","5518051","node_8945","node_8937","8"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINSOL] PFAS|EPA: Chemical Inventory Insoluble in DMSO","5521524","node_8961","node_8937","42"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINV] PFAS|EPA: ToxCast Chemical Inventory","5517742","node_8943","node_8937","427"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASINVIVO] PFAS|EPA: In Vivo Studies Available","5517886","node_8944","node_8937","22"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASLITSEARCH] PFAS|EPA: Literature Search Completed","5521727","node_8962","node_8937","22"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASNONDW] PFAS|EPA: New EPA Method Non-Drinking Water","5524299","node_8969","node_8937","23"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASRESEARCH] PFAS|EPA: EPA PFAS Research List","5524835","node_8972","node_8937","164"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASRL] PFAS|EPA: Cross-Agency Research List","5518845","node_8950","node_8937","192"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASTOX] PFAS|EPA: Toxicity Assessments","5516853","node_8941","node_8937","8"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[EPAPFASVALDW] PFAS|EPA|WATER: PFAS with Validated EPA Drinking Water Methods","5524434","node_8970","node_8937","30"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASDEV1] PFAS|EPA PFAS chemicals without explicit structures","5518472","node_8948","node_8937","44"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASGLUEGE] PFAS|NORMAN: Overview of PFAS Uses from Gluege et al (2020)","5519721","node_8954","node_8937","482"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASINVITRO] PFAS|EPA: List of chemicals tested in in vitro methods 2019-2020","5519243","node_8953","node_8937","181"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASKEMI] PFAS: List from the Swedish Chemicals Agency (KEMI) Report","5523875","node_8968","node_8937","1472"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASLCMSGCMS] PFAS: Collection of GC-MS and LC-MS standards: Food Contact Materials","5521221","node_8960","node_8937","37"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASMASTER] PFAS Master List of PFAS Substances (Version 2)","5518495","node_8949","node_8937","8116"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASNORDIC] PFAS: Nordic PFAS Report 2019","5522465","node_8964","node_8937","202"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASNTREV19] PFAS: PFAS in Non-Target HRMS Studies (Liu et al 2019)","5522601","node_8966","node_8937","126"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASOECD] PFAS: Listed in OECD Global Database","5518176","node_8946","node_8937","3701"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASOECDNA] NORMAN: List of PFAS from the OECD Curated by Nikiforos Alygizakis","5519841","node_8955","node_8937","3205"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASPACKAGING] PFAS|EPA PFAS Substances in Pesticide Packaging","5520210","node_8956","node_8937","7"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCT] Navigation Panel to PFAS Structure Lists","5524542","node_8971","node_8937","8078"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV1] PFAS|EPA: PFAS structures in DSSTox (update March 2018)","5520485","node_8957","node_8937","4350"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV2] PFAS|EPA: PFAS structures in DSSTox (update November 2019)","5519088","node_8952","node_8937","6624"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTV3] PFAS|EPA: PFAS structures in DSSTox (update August 2020)","5516782","node_8940","node_8937","8136"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASSTRUCTv4] PFAS|EPA: PFAS structures in DSSTox (update August 2021)","5518869","node_8951","node_8937","8078"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTDB] WATER|PFAS: PFAS Chemicals contained in the EPA Drinking Water Treatability Database","5520989","node_8959","node_8937","37"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTOXDB] PFAS: PFAS-Tox Database","5525323","node_8973","node_8937","42"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTRI] PFAS: PFAS to the Toxics Release Inventory (TRI) Program by the National Defense Authorization Act","5517389","node_8942","node_8937","97"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PFASTRIER] PFAS Community-Compiled List (Trier et al. 2015)","5522560","node_8965","node_8937","588"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"[PRORISKPFAS] NORMAN|List of PFAS Compiled from NORMAN-SusDat","5518201","node_8947","node_8937","3371"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"NORMAN-SLE PFAS Suspect Lists","5517745","node_8928","node_8923","5884"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S09 | PFASTRIER | PFAS Suspect List of fluorinated substances from X. Trier and colleagues","5523688","node_8934","node_8928","468"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S14 | KEMIPFAS | PFAS Highly Fluorinated Substances List from KEMI","5524111","node_8936","node_8928","1344"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S25 | OECDPFAS | List of PFAS from the OECD","5522807","node_8932","node_8928","3692"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S46 | PFASNTREV19 | List of PFAS reported in Non-Target HRMS Studies from Liu et al 2019","5523394","node_8933","node_8928","680"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S80 | PFASGLUEGE | Overview of PFAS Uses","5522532","node_8931","node_8928","1250"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S89 | PRORISKPFAS | List of PFAS Compiled from NORMAN SusDat","5516725","node_8929","node_8928","4240"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S92 | FLUOROPHARMA | List of 340 ATC classified fluoro-pharmaceuticals","5520737","node_8930","node_8928","290"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"S94 | FLUOROPEST | List of 423 FRAC/HRAC/IRAC classified fluoro-agrochemicals","5523938","node_8935","node_8928","318"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS Lists","5517067","node_8924","node_8923","26805"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Definition A","5522450","node_8926","node_8924","26805"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Definition B","5524740","node_8927","node_8924","4114"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"OntoChem PFAS from CORE - Defintion C","5521474","node_8925","node_8924","3432"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"Other Organic Fluorinated Chemical Content in PubChem","5524741","node_8974","node_8923","1674"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"MeSH: Fluorinated Hydrocarbons","5517545","node_8975","node_8974","295"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"CAMEO Chemicals: Fluorinated Organic Compounds","5521039","node_8981","node_8974","120"
120,"PubChem","pfas5_pubchem_tree",5516029,19156786,"ChEBI: Organofluorine Compound","5519872","node_8980","node_8974","1372"
# Functions to extract data from PubChem Classification Trees
# E. Schymanski, 30/10/2020 - 12/11/2020
# Plus Evan Bolton and Paul Thiessen
# Functions ported from hid_tree_JSON.R
# Function copied from hid_tree_functions.R
# https://gitlab.lcsb.uni.lu/eci/pubchem/-/raw/master/annotations/keywords/hids/hid_tree_functions.R
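## This function writes a CSV file containing information about the nodes
## of one entire tree (query = hid number) and returns the name of that file,
## or NA if the request fails or the service reports a Fault.
## depth (default 2) specifies how deep to go; a limit is definitely needed
## for large trees like hid=1 and 2. depth=NA omits the depth parameter.
## Requires httr and a JSON package providing fromJSON() to be loaded.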
getPcHidTree <- function(query,depth=2)
{
baseURL <- "https://pubchem.ncbi.nlm.nih.gov/classification/cgi/classifications.fcgi?"
if (!is.na(depth)) {
url <- paste0(baseURL, "format=json","&depth=",depth,"&hid=", query)
} else {
url <- paste0(baseURL, "format=json&hid=", query)
}
errorvar <- 0
currEnvir <- environment()
tryCatch(
{# data <- getURL(URLencode(url),timeout=8),
res <- httr::GET(URLencode(url))
data <- httr::content(res, type="text", encoding="UTF-8")
},
error=function(e){
currEnvir$errorvar <- 1
})
if(errorvar){
return(NA)
}
# get the data into useful format
r <- fromJSON(data)
if(!is.null(r$Fault))
return(NA)
dat <- r$Hierarchies$Hierarchy[[1]]
hid <- dat$HID
source_name <- dat$SourceName
sourceID <- dat$SourceID
hnid <- dat$Information$HNID
# to extract as numbers (but this won't help as they are listed as node_XXX)
# childIDs <- as.numeric(sub("node_","",dat$Information$ChildID))
childIDs <- dat$Information$ChildID
# find the Counts entry that holds the CID (compound) count by testing
# its "Type", rather than assuming CIDs are always first and SIDs second
i_cid <- NA
# n_cids stays NA if the tree reports no CID count (it is used further below)
n_cids <- NA
for (i in seq_along(dat$Information$Counts)) {
type <- dat$Information$Counts[[i]]$Type
if (type=="CID" || type=="Compound") {
i_cid <- i
}
}
if (!is.na(i_cid)) {
n_cids <- dat$Information$Counts[[i_cid]]$Count
}
# look further below for code that still needs updating ...
# n_cids <- dat$Information$Counts[[1]]$Count
# if (length(dat$Information$Counts)>1) {
# # then maybe we have SIDs too
# n_sids <- dat$Information$Counts[[2]]$Count
# } else {
# n_sids <- 0
# }
n_nodes <- length(dat$Node)
#nodeNames <- which(unlist(lapply(dat$Node, function(i) !is.null(i$Information$Name))))
nodeNames <- unlist(lapply(dat$Node, function(i) i$Information$Name))
# for KEGG we seem to have Description not Names
nodeDesc <- unlist(lapply(dat$Node, function(i) i$Information$Description))
nodeHNID <- unlist(lapply(dat$Node, function(i) i$Information$HNID))
# now test if nodeNames is NULL, if so, replace with Desc
if (is.null(nodeNames) && !is.null(nodeDesc)) {
nodeNames <- nodeDesc
} else if (is.null(nodeNames) && is.null(nodeDesc)) {
warn_msg <- paste0("Neither Name nor Description available for Nodes for ",
"hid ",query)
warning(warn_msg)
nodeNames <- vector(mode="character",length=length(nodeHNID))
}
# not all entries have CIDs
#node_info <- dat$Node
i_node_nCIDs <- vector(mode="numeric",length=n_nodes)
node_nCIDs <- vector(mode="numeric",length=n_nodes)
# seq_len/seq_along avoid the 1:0 trap when there are no nodes or counts
for (i in seq_len(n_nodes)) {
node_data <- dat$Node[[i]]
i_node_cid <- NA
# test if there are any counts ... if so look for CIDs ...
if (!is.null(node_data$Information$Counts)) {
for (j in seq_along(node_data$Information$Counts)) {
type <- node_data$Information$Counts[[j]]$Type
if (type=="CID" || type=="Compound") {
i_node_cid <- j
}
}
}
# if there's CIDs, save the count
if (!is.na(i_node_cid)) {
node_nCID <- node_data$Information$Counts[[i_node_cid]]$Count
node_nCIDs[i] <- node_nCID
} #otherwise remains at 0
i_node_nCIDs[i] <- i_node_cid
#node_nCIDs[i] <- node_nCID
}
# i_node_nCIDs <- which(unlist(lapply(dat$Node, function(i) !is.null(i$Information$Counts[[1]]$Count))))
# node_nCIDs <- unlist(lapply(dat$Node, function(i) i$Information$Counts[[1]]$Count))
nodeIDs <- unlist(lapply(dat$Node, function(i) i$NodeID))
parentIDs <- unlist(lapply(dat$Node, function(i) i$ParentID))
#sapply(NodeText, function(x)dat$Node[[x]]$Information$Name)
#dat$Node[[]]
# hid_info <- cbind(nodeNames,nodeHNID,node_nCIDs,nodeIDs,parentIDs)
# hid_info <- as.data.frame(cbind(nodeNames,nodeHNID,nodeIDs,parentIDs))
hid_info <- as.data.frame(cbind(nodeNames,nodeHNID,nodeIDs,parentIDs,node_nCIDs))
# now add in global data
hid_info$hid <- hid
hid_info$SourceName <- source_name
hid_info$SourceID <- sourceID
hid_info$HNID <- hnid
hid_info$nCIDs <- n_cids
# # now add nCIDs
# hid_info$node_nCIDs <- 0
#
# for (i in 1:length(i_node_nCIDs)) {
# index <- i_node_nCIDs[i]
# hid_info$node_nCIDs[index] <- node_nCIDs[i]
# }
#do some reordering
# hid_info <- hid_info[,c(5:8,1:4,9)]
hid_info <- hid_info[,c(6:10,1:5)]
if (!is.na(depth)) {
export_name <- paste0("classification_tree_hid",query,"_depth",depth,"_export.csv")
} else {
export_name <- paste0("classification_tree_hid",query,"_export.csv")
}
#export_name <- paste0("classification_tree_hid",query,"_export.csv")
write.csv(hid_info,export_name,row.names = F)
if(is.null(hid_info)){
return(NA)
} else{
return(export_name)
}
}
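## Usage sketch (an illustration, not part of the original file).
## A minimal example of working with the CSV written by getPcHidTree(),
## assuming the column layout produced above (nodeNames, nodeIDs, parentIDs,
## node_nCIDs). The helper name getPcHidChildren is hypothetical.
getPcHidChildren <- function(tree, parent_id)
{
# keep only the rows whose parent is the requested node
keep <- which(tree$parentIDs == parent_id)
kids <- tree[keep, c("nodeNames","nodeIDs","node_nCIDs")]
# counts may be stored as text, so convert before sorting
kids$node_nCIDs <- as.numeric(kids$node_nCIDs)
# largest child nodes first
kids[order(-kids$node_nCIDs), ]
}
## Example call (needs network access, so it is left commented out):
# csv_file <- getPcHidTree(120, depth=NA)
# if (!is.na(csv_file)) {
# tree <- read.csv(csv_file, stringsAsFactors = FALSE)
# getPcHidChildren(tree, "node_8924") # children of "OntoChem PFAS Lists"
# }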
......@@ -90,3 +90,21 @@
author = {Mayfield, John}
}
@misc{pubchem_pfas_metfrag_2022,
title = {{PubChem} {OECD} {PFAS} {Larger} {PFAS} {Parts} file for {MetFrag}},
copyright = {Creative Commons Attribution 4.0 International, Open Access},
url = {https://zenodo.org/record/6385954},
abstract = {This is a MetFrag database file constructed from the "Molecule contains PFAS parts larger than CF$_{\textrm{2}}$/CF$_{\textrm{3}}$" subnode of the OECD PFAS Definition node in the PFAS and Fluorinated Organic Compounds in PubChem Tree on the Classification Browser in PubChem. This file was constructed by downloading the node contents, selecting the columns of interest, changing the headers to MetFrag-compatible headers and adding exact mass, PubMed ID (PMID) counts and patent counts to the file (via this package). Entries containing Xe and Pr were removed. The construction of the tree is documented here (work in progress).},
urldate = {2022-03-26},
publisher = {Zenodo},
author = {{Schymanski, Emma} and {Bolton, Evan} and {Chirsir, Parviel} and {Kondic, Todor} and {Thiessen, Paul} and {Zhang, Jian}},
month = mar,
year = {2022},
doi = {10.5281/zenodo.6385954},
note = {Version Number: hid120\_HNID5525061\_20220324
Type: dataset},
keywords = {MetFrag, PFAS},
}