Commit 92f1e490 authored by Emma Schymanski

Various improvements

... as suggested by (and/or discussed with) Evan.
parent 17c92dc9
......@@ -61,7 +61,7 @@ This document is organised into several sections, as follows:
|_Search via PubChem Search_ | [Go to heading](#pc-search) | 6 |
|_Interactions via Entrez_ | [Go to heading](#entrez) | 6 |
|_Interactions via PUG REST_ | [Go to heading](#pugrest) | 6 |
|Implementation | [Go to heading](#impl) | 7 |
|Further Details | [Go to heading](#details) | 7 |
|Statements and References | [Go to heading](#statements) | 8 |
<!-- |References | [Go to heading](#statements) | 6 | -->
......@@ -105,7 +105,7 @@ CF~3~ groups (5.4 M), these were separated into individual sections.
Note that here, "**PFAS part**" is used to describe a connected portion of
the molecule that satisfies the OECD PFAS definition. A given molecule may have
more than one PFAS part present; some examples are given in Figure 2,
along with the count of parts.
along with the count of parts. For further information, see [Details](#details).
![Examples of molecules with varying PFAS parts highlighted, drawn using CDK Depict [@mayfield_cdk].](fig/PFAS_parts_CDK.png)
......@@ -135,8 +135,8 @@ PFAS part_"), if not, a list of the possibilities is given directly
(Figure 4, middle left, "_Contains isolated unsaturated-cyclic part_").
The "_Contains only isolated CF~2~_" (or, for the CF~3~ node, only isolated
CF~3~) node (Figure 4, middle panel) is broken down by the number of
isolated groups (CF~2~ or, for the CF~3~ node, by CF~3~ groups). In both
CF~3~) is broken down by the number of isolated groups (CF~2~ or,
for the CF~3~ node, by CF~3~ groups) - see Figure 4, middle panel. In both
cases the vast majority of molecules have only one isolated group.
The "_Contains only isolated CF~2~/CF~3~_" is also broken down by
......@@ -197,8 +197,9 @@ breakdown by "_Also contains..._".
![The "Molecule contains PFAS parts larger than CF~2~/CF~3~" part of the OECD PFAS Definition node, with dynamic breakdown of subnodes by isolated PFAS part type (numbers from 24 March 2022).](fig/OECDPFAS_PFAS_part_type.png)
The dynamic design reduces the scrolling among users and also helps reduce
the load time for large parts of the tree. It is possible to use some
The dynamic navigation approach reduces scrolling for users and
also helps reduce data loading time when many entries are
present within a node. It is possible to use some
advanced search and querying capabilities to improve the interaction
with the tree, see [Navigating the Tree](#search) below.
......@@ -206,8 +207,8 @@ The _PFAS Parts Larger than CF~2~/CF~3~_ is available as
a [MetFrag](https://msbi.ipb-halle.de/MetFrag/) file for further use
[@pubchem_pfas_metfrag_2022]. The CSV can be downloaded from Zenodo
(DOI:[10.5281/zenodo.6385954](https://doi.org/10.5281/zenodo.6385954))
for use in MetFragCL, it should be available from the MetFragWeb
drop down menu soon. See the description on the Zenodo record for
for use in MetFragCL and will be made available from the MetFragWeb
drop down menu. See the description on the Zenodo record for
more details.
......@@ -224,30 +225,34 @@ Further content still to come ...
### PFAS and Fluorinated Organic Compound Collections {#lists}
This section of the PFAS tree gathers together content from various
locations from within PubChem, with the possibility to add some
external content. The mapping files to construct this are kept
This section of the PFAS tree contains various lists gathered
across PubChem content. Additional community-based PFAS lists may
also be added here. The mapping files to construct this are kept
on the [eci/pubchem](https://gitlab.lcsb.uni.lu/eci/pubchem/)
repository on GitLab.
Currently, the content (see Figure 7) comes from:
- all [PFAS lists](https://comptox.epa.gov/dashboard/chemical-lists?filtered=&search=PFAS)
- All [PFAS lists](https://comptox.epa.gov/dashboard/chemical-lists?filtered=&search=PFAS)
from the
[CompTox Chemicals Dashboard](https://comptox.epa.gov/dashboard/)
[@williams_comptox_2017];
- all [PFAS lists](https://zenodo.org/communities/norman-sle/search?page=1&size=20&q=PFAS)
[@williams_comptox_2017] via the
[EPA DSSTox Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=105)
in PubChem;
- All [PFAS lists](https://zenodo.org/communities/norman-sle/search?page=1&size=20&q=PFAS)
from the NORMAN Suspect List Exchange
([NORMAN-SLE](https://www.norman-network.com/nds/SLE/))
- the CORE PFAS lists from OntoChem [@barnabas_extracting_2022]
([NORMAN-SLE](https://www.norman-network.com/nds/SLE/)) via the
[NORMAN-SLE Tree](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101)
in PubChem;
- The CORE PFAS lists from OntoChem [@barnabas_extracting_2022];
- Other collections from within PubChem Classification Trees, including
[Cameo](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=86),
[ChEBI](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=2) and
[MeSH](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=1)
[MeSH](https://pubchem.ncbi.nlm.nih.gov/classification/#hid=1).
![The "PFAS and Fluorinated Organic Compound Collections" node, with all major collections shown (status 24 March 2022).](fig/PFAS_list_of_lists.png)
![The "PFAS and Fluorinated Organic Compound Collections" node, with all major collections shown (as of 24 March 2022).](fig/PFAS_list_of_lists.png)
## Navigating the Tree {#search}
......@@ -335,12 +340,17 @@ write.csv(PFAS_tree, file="PubChem_PFAS_Tree_Details.csv",row.names = F)
## Implementation {#impl}
## Further Details {#details}
Content still to come ...
- test set
- notes on implementation
- Exclude molecules from consideration if:
    - Is a mixture (i.e., has multiple components, which includes any salts)
    - Contains a radical or isotopically labelled atom
    - [molecules with non-organic elements not currently being removed]
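
The exclusion criteria listed above could be sketched roughly as follows. This is only an illustration in Python, not the implementation used to build the tree: the function name `exclusion_reasons`, the `has_radical` argument and the SMILES-based heuristics are all assumptions introduced here. It assumes that a `.` in a SMILES string marks multiple components and that a bracket atom starting with a mass number (e.g. `[13C]`) marks an isotopic label. Radicals cannot be detected reliably from the string alone, so that flag is expected to come from a cheminformatics toolkit.

```python
import re

# Assumption: a bracket atom that starts with a mass number, e.g. [13C] or [2H],
# indicates an isotopically labelled atom.
ISOTOPE_PATTERN = re.compile(r"\[\d+[A-Za-z]")


def exclusion_reasons(smiles, has_radical=False):
    """Return the reasons a structure would be excluded (empty list if kept).

    Hypothetical helper for illustration only; `has_radical` must be
    supplied by the caller, e.g. from a cheminformatics toolkit.
    """
    reasons = []
    # A '.' separates disconnected components (mixtures, including salts).
    if "." in smiles:
        reasons.append("mixture")
    # Radical detection is not attempted here; the flag is passed in.
    if has_radical:
        reasons.append("radical")
    # Look for a mass number at the start of any bracket atom.
    if ISOTOPE_PATTERN.search(smiles):
        reasons.append("isotope")
    return reasons


if __name__ == "__main__":
    examples = [
        "OC(=O)C(F)(F)C(F)(F)F",      # single component, no labels
        "[Na+].[O-]C(=O)C(F)(F)F",    # salt, i.e. multiple components
        "[13C](F)(F)(F)C(=O)O",       # isotopically labelled carbon
    ]
    for smi in examples:
        print(smi, exclusion_reasons(smi))
```

Returning the list of reasons, rather than a single true/false value, makes it easier to report why a given structure was left out. Molecules with non-organic elements are deliberately not filtered in this sketch, matching the note above that they are not currently being removed.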