Commit f6ee07da authored by Emma Schymanski

More doc tweaks

... images updated with latest versions from test, text adjustments and some clean-up.
parent ae50bc23
......@@ -3,7 +3,7 @@ title: "PFAS and Fluorinated Compounds in PubChem Tree"
author:
- "Emma L. Schymanski^1^*, Parviel Chirsir^1^, Todor Kondic^1^,"
- "Paul A. Thiessen^2^, Jian Zhang^2^ and Evan E. Bolton^2^*"
date: "28/05/2022"
date: "29/05/2022"
output: pdf_document
csl: journal-of-cheminformatics.csl
bibliography: refs.bib
......@@ -53,7 +53,7 @@ The
(see [Figure 1](#treenodes) and [Contents listing](#cont))
includes all compounds in [PubChem](https://pubchem.ncbi.nlm.nih.gov/)
satisfying various definitions, as explained later in this document.
Each compound in PubChem has a PubChem Compound Identifier (CID), and the
Note that each compound in PubChem has a PubChem Compound Identifier (CID), and the
blue numbers next to each node header reflect the number of
compounds (_i.e._ CIDs) in that node.
......@@ -90,28 +90,19 @@ Table: _Contents list for the PubChem PFAS Tree documentation._
|References | [Go to heading](#refs) | 14 |
<!-- To become more familiar with the PubChem Classification Browser features -->
<!-- in general before embarking on content specific to the PFAS tree, -->
<!-- see Section [Navigating the Tree](#search). -->
<!-- There is also extensive documentation on the PubChem website, see: -->
<!-- - https://pubchem.ncbi.nlm.nih.gov/classification/ -->
<!-- - https://pubchemdocs.ncbi.nlm.nih.gov/classification-browser -->
<!-- - https://pubchem.ncbi.nlm.nih.gov/classification/docs/classification_help.html -->
## PubChem PFAS Tree Nodes {#treenodes}
The tree is currently split into three main nodes that are constructed and
The tree is currently split into four main nodes that are constructed and
compiled separately (see [Figure 1](#treenodes)).
More nodes are under development and will be released as they are ready.
Further details are given below.
<!-- To become more familiar with the PubChem Classification Browser features, -->
<!-- see Section [Navigating the Tree](#search). -->
Further details about each of the nodes are given below.
To become more familiar with the PubChem Classification Browser features,
see Section [Navigating the Tree](#search).
![_The "[PFAS and Fluorinated Compounds in PubChem Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120)" Landing Page._](fig/PFAS_Tree_Landing.png)
![_The "[PFAS and Fluorinated Compounds in PubChem Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120)" Landing Page (29 May 2022)._](fig/PFAS_Tree_Landing.png)
<!-- TODO: Update Figure 1 -->
### OECD PFAS Definition {#oecddef}
......@@ -130,12 +121,11 @@ along with the count of parts. For more information, see section
Browsing the 6 million entries in this node (see Figure 3) is challenging.
Since most of these PFAS contain isolated CF~2~ (600 K entries) or
CF~3~ groups (5.4 M entries), these were separated into individual sections
<!-- (see "[_isolated CF~2~ and CF~3~_](#isonodes)"). -->
(see [next section](#isonodes)).
~188 K compounds contain PFAS parts larger than CF~2~/CF~3~
(see "[_isolated CF~2~ and CF~3~_](#isonodes)").
<!-- (see [next section](#isonodes)). -->
Approximately 188 K compounds contain PFAS parts larger than CF~2~/CF~3~
(see "[larger PFAS parts](#largerparts)").
<!-- #### PFAS "parts": -->
![_Examples of molecules with varying PFAS parts highlighted, drawn using [CDK Depict](https://www.simolecule.com/cdkdepict/depict.html) [@mayfield_cdk]._](fig/PFAS_parts_CDK.png)
......@@ -143,9 +133,8 @@ CF~3~ groups (5.4 M entries), these were separated into individual sections
The _OECD PFAS Definition_ node,
with the top two levels of subnodes, is shown in Figure 3.
![_The OECD PFAS Definition part of the PFAS tree, with top two subnodes (24 March 2022)._](fig/OECDPFAS_TopTwoSubnodes_v4.png)
![_The OECD PFAS Definition part of the PFAS tree, with top two subnodes (29 May 2022)._](fig/OECDPFAS_TopTwoSubnodes.png)
<!-- TODO: update Figure 3 -->
### OECD PFAS - Isolated CF~2~ and CF~3~ Nodes {#isonodes}
......@@ -240,7 +229,7 @@ a [MetFrag](https://msbi.ipb-halle.de/MetFrag/)
(DOI: [10.5281/zenodo.6385954](https://doi.org/10.5281/zenodo.6385954))
for use in
[MetFragCL](https://ipb-halle.github.io/MetFrag/projects/metfragcl/)
and will be made available from the
and is available from the
[MetFragWeb](https://msbi.ipb-halle.de/MetFrag/)
dropdown menu. This file contains several useful fields
from the [Download](#pc-search) file as well as Patent and Literature
......@@ -280,21 +269,14 @@ The exact mass subcategories are split into the ranges
1-250, 250-500, 500-750, 750-1000 and >1000; a subcategory is only present
if there are CIDs within its range.
<!-- The exact mass is split as follows: -->
<!-- - Exact mass range 1-250 -->
<!-- - Exact mass range 250-500 -->
<!-- - Exact mass range 500-750 -->
<!-- - Exact mass range 750-1000 -->
<!-- - Exact mass range >1000 -->
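
The exact mass split described above can be illustrated with a minimal sketch. This is not the code used to build the tree: the function names, the example CIDs and masses, and the handling of boundary values are all assumptions made for illustration. In particular, a mass that sits exactly on a boundary (e.g. 250) is assigned to the lower range here, which is only inferred from the ">1000" label; masses below 1 are simply placed in the first range.

```python
from bisect import bisect_left
from collections import Counter
from typing import Dict

# Upper edges of the bounded ranges named in the text.
UPPER_EDGES = [250, 500, 750, 1000]
LABELS = ["1-250", "250-500", "500-750", "750-1000", ">1000"]


def mass_range_label(exact_mass: float) -> str:
    """Return the exact mass range label for a single compound."""
    if exact_mass <= 0:
        raise ValueError("exact mass must be positive")
    # bisect_left gives the index of the first edge >= exact_mass, so a
    # mass equal to an edge lands in the lower range (assumption).
    return LABELS[bisect_left(UPPER_EDGES, exact_mass)]


def non_empty_ranges(masses_by_cid: Dict[int, float]) -> Dict[str, int]:
    """Count CIDs per range, keeping only ranges that contain CIDs."""
    counts = Counter(mass_range_label(m) for m in masses_by_cid.values())
    return {label: counts[label] for label in LABELS if counts[label] > 0}


if __name__ == "__main__":
    # Hypothetical CIDs and masses, for illustration only.
    example = {1: 100.0, 2: 413.97, 3: 499.0, 4: 1200.5}
    for label, n in non_empty_ranges(example).items():
        print(f"Exact mass range {label}: {n} CIDs")
```

Dropping empty ranges in `non_empty_ranges` mirrors the statement that a subcategory only appears when at least one CID falls within it.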
### Other Diverse Fluorinated Compounds {#divf}
The "_Other Diverse Fluorinated Compounds_" section of the
PubChem PFAS Tree is designed to help users explore various
cases of fluorine chemistry not necessarily covered in the OECD PFAS
or Organofluorine compound sections above. The navigation in this
[PubChem PFAS Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120)
is designed to help users explore various
cases of fluorine chemistry not necessarily covered in the [OECD PFAS](#oecddef)
or [Organofluorine compound](#orgf) sections above. The navigation in this
section helps explore fluorinated compound chemistry by various
fluorine-heteroatom bonds and the occurrence of different elements
(see Figure 8).
......@@ -304,7 +286,7 @@ in the other sections of the PubChem PFAS Tree - the overlap
can be investigated in Entrez (see section
[Interactions via Entrez](#entrez) below).
![_The "Other diverse fluorinated compounds" part of the PubChem PFAS Tree, showing the breakdown by fluorine bonded to non-carbon elements and by non-organic element (interim numbers from 27 May 2022)._](fig/DiverseFcmpds_v2.png)
![_The "Other diverse fluorinated compounds" part of the PubChem PFAS Tree, showing the breakdown by fluorine bonded to non-carbon elements and by non-organic element (numbers from 29 May 2022)._](fig/DiverseFcmpds.png)
#### The "Contains fluorine bond to non-carbon element"
......@@ -335,7 +317,7 @@ The mapping files to construct this are kept
on the [eci/pubchem](https://gitlab.lcsb.uni.lu/eci/pubchem/)
repository on GitLab.
![_The "PFAS and Fluorinated Organic Compound Collections" node, with all major collections shown (CompTox as inset). Numbers and content listing from 24 March 2022._](fig/PFAS_list_of_lists.png)
![_The "PFAS and Fluorinated Compound Collections" node, with all major collections shown (CompTox and OntoChem as insets). Numbers and content listing from 29 May 2022._](fig/PFAS_list_of_lists.png)
Currently, the content displayed in Figure 9 comes from:
......@@ -361,7 +343,7 @@ Additional community-based PFAS can also be added to this section.
We will be happy to add new collections where feasible.
If you have any suggestions, please email
[pubchem-help@ncbi.nlm.nih.gov](mailto:pubchem-help@ncbi.nlm.nih.gov) or
[normansle@uni.lu](mailto:normansle@uni.lu)) for further details.
[normansle@uni.lu](mailto:normansle@uni.lu) for further details.
## Navigating the Tree {#search}
......@@ -520,6 +502,8 @@ please see the following locations in the PubChem documentation:
- https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest$classification_nodes
#### Interacting with the PubChem PFAS Tree in R
The following contains a few tips to start interacting with the tree in R;
note that some of these features are also in active development.
......@@ -594,7 +578,7 @@ will be expanded as further questions arise.
The [PubChem PFAS Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120)
currently excludes molecules (compounds) from consideration if they:
- are a mixture (i.e., has multiple components, which includes any salts);
- are a mixture (i.e., contain multiple components, including any salts);
- contain a radical or isotopically labelled atom.
Since the entire tree is constructed on CIDs (_i.e._, compounds), substance
......@@ -617,12 +601,13 @@ requested (and if reasonably possible).
### Future plans
The current approach still has room for improvement; the following are being addressed
The [PubChem PFAS Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120)
is undergoing active development.
The following are being addressed
in future developments (and will be released as ready):
- Handling of ethers and other connecting atoms;
- Handling of unsaturated PFAS.
<!-- - Better browseability of special cases. -->
## Contact Details
......@@ -634,8 +619,9 @@ or email [Evan](mailto:evan.bolton@nih.gov) and
with feedback and comments!
If you have any suggestions for PFAS or fluorinated compound collections to
include in the "_PFAS and Fluorinated Compound Collections_" part
of the PubChem PFAS Tree, please email
include in the "_PFAS and Fluorinated Compound Collections_" section of the
[PubChem PFAS Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=120),
please contact us at
[pubchem-help@ncbi.nlm.nih.gov](mailto:pubchem-help@ncbi.nlm.nih.gov) or
[normansle@uni.lu](mailto:normansle@uni.lu).
......@@ -646,8 +632,6 @@ described here, please reach out to the
for further support.
<!-- ## Closing -->
## Statements {#statements}
......
Changed images (size before | size after):
pfas-tree/fig/DiverseFcmpds.png (615 KB | 569 KB)
pfas-tree/fig/PFAS_Tree_Landing.png (314 KB | 360 KB)
pfas-tree/fig/PFAS_list_of_lists.png (956 KB | 987 KB)