*** Task 1: Data Retrieval
* Finding the unique TPs in Anliker and S66 EAWAGTPs
• Download the EAWAG data file *S66 | EAWAGTPs | Parent – Transformation Products* from [[][Zenodo]]
• Data file - *Parent – TPs – EawagTPandParents.csv*
* Download the supplementary file containing the list of target analytes (*es9b07085_si_002.xlsx*) from [[][Anliker,]]
* The supplementary file contains the substance name, CAS No., molecular formula and class of each TP (it lacks information about the parent compound from which each TP was derived)
*** Task 2: CompTox Batch Search
* The Anliker data contains only CAS-Nos. and no *PubChem CIDs*, hence the CAS-Nos. were used as input for [[][CompTox Batch Search]]
* Select Input Type(s) – *CASRN* > select Download Chemical Data
* Select Output Format (Excel) and download the file containing the following parameters:
• DTXCID, DTXSID, Chem Name, CAS-RN, INCHIKEY, IUPAC, SMILES, INCHISTRING, Molecular formula and Monoisotopic mass
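Because the batch search depends entirely on the CAS-Nos., it can help to screen the list for malformed entries before uploading it. The sketch below is not part of the original workflow; it applies the standard CAS Registry Number check-digit rule (digits weighted 1, 2, 3, ... from right to left, summed, modulo 10). The function names `is_valid_casrn` and `split_cas_list` are hypothetical helpers introduced for illustration.

```python
import re

# A CAS RN has 2-7 digits, a hyphen, 2 digits, a hyphen and 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn: str) -> bool:
    """Return True if casrn is well formed and its check digit is correct."""
    match = CAS_PATTERN.match(casrn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


def split_cas_list(casrns):
    """Split a list of CAS-Nos. into (valid, rejected) for manual review."""
    valid, rejected = [], []
    for casrn in casrns:
        (valid if is_valid_casrn(casrn) else rejected).append(casrn)
    return valid, rejected
```

Entries that land in the rejected list are candidates for manual correction before the batch search is run. A passing check digit only shows that the number is internally consistent, not that it belongs to the intended substance.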
*** Task 3: Finding out PubChem CIDs of Anliker TPs
* Use *CASRN* from the output of batch search as an input in [[][PubChem Identifier Exchange]] to get CIDs
* Input ID lists – Choose *Synonyms* and provide CAS-Nos. SMILES and INCHIs can also be given as input
* For the Operator type select *Same CID*
* For the Output ID select *CID*
* Select *Entrez History* in Output method and click *Submit job*
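The steps above use the Identifier Exchange web form. As a scripted alternative (not the method used in this workflow), PubChem's PUG REST service can resolve a name or synonym such as a CAS-No. to CIDs. The sketch below is a minimal illustration under that assumption; the helper names `cid_lookup_url`, `parse_cids` and `fetch_cids` are hypothetical, and the results of a synonym lookup may differ from those of the Identifier Exchange.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(casrn: str) -> str:
    """Build the PUG REST URL that resolves a name/synonym to CIDs."""
    return f"{PUG_REST}/compound/name/{quote(casrn.strip(), safe='')}/cids/JSON"


def parse_cids(payload: dict) -> list:
    """Extract the CID list from a PUG REST 'cids' JSON response."""
    return payload.get("IdentifierList", {}).get("CID", [])


def fetch_cids(casrn: str, timeout: float = 30.0) -> list:
    """Query PubChem for one CAS-No. (requires network access).

    An unmatched identifier makes urlopen raise urllib.error.HTTPError,
    which the caller should catch and record for manual follow-up.
    """
    with urlopen(cid_lookup_url(casrn), timeout=timeout) as response:
        return parse_cids(json.load(response))
```

A CAS-No. can map to more than one CID, so the returned list should be reviewed rather than truncated to its first element. When looping over many identifiers, requests should be spaced out to respect PubChem's usage limits.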
*** Task 4: Identifying unique TPs
* Data retrieval of existing TPs from PubChem Classification Browser
• TP information for Swiss Pesticides and Metabolites (S60), EawagTPs (S66), HSDBTPs (S68), LUXPEST (S69) and REFTPs (S74) was obtained from the [[][Norman Suspect List Exchange Classification]] of the PubChem Classification Browser
/(Image obtained from FAIRTP documentation- Emma and Adelene)/
* In the above figure, under S66|EAWAGTPS, the Transformation Products section has 158 TPs. Select the blue icon showing 158 to display the transformations in a new window (see the image below)
/(Image obtained from FAIRTP documentation- Emma and Adelene)/
* The annotation section of the TPs can be found under the [[][Use and Manufacturing]] section of the PubChem compound page
/(Image obtained from FAIRTP documentation- Emma and Adelene)/
* If a transformation is recorded for a particular compound, the TP information can be found directly under the [[][Transformations]] section of the PubChem compound page
/(Image obtained from FAIRTP documentation- Emma and Adelene)/
* A complete list of Pharmacology and Biochemistry TPs can be obtained from the [[][PubChem Compound TOC]] of the PubChem Classification Browser
* Removing duplicate TPs between Anliker and existing TPs of Norman Suspect List Exchange
• CIDs were used as the key field to identify unique TPs, and duplicates were removed
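The CID-based duplicate removal can be sketched as follows. This is a minimal illustration rather than the script used here: the function name `unique_tps` and the `cid` and `name` keys are assumptions, and the sample CIDs are placeholders, not real data.

```python
def unique_tps(candidate_rows, existing_cids):
    """Keep candidate TPs whose CID is not already in the existing lists.

    candidate_rows: list of dicts with a 'cid' key (None if unresolved).
    existing_cids:  iterable of CIDs already present in the suspect lists.
    Returns (unique, unresolved); rows without a CID are set aside
    for manual curation instead of being dropped silently.
    """
    seen = set(existing_cids)
    unique, unresolved = [], []
    for row in candidate_rows:
        cid = row.get("cid")
        if cid is None:
            unresolved.append(row)
        elif cid not in seen:
            seen.add(cid)  # also removes repeats within the candidate list
            unique.append(row)
    return unique, unresolved


# Placeholder data for illustration only.
anliker = [
    {"name": "TP A", "cid": 101},
    {"name": "TP B", "cid": 202},
    {"name": "TP B (synonym)", "cid": 202},
    {"name": "TP C", "cid": None},
]
existing = {101, 303}
new_tps, needs_review = unique_tps(anliker, existing)
```

Keeping the unresolved rows separate matters for this dataset, since entries without a CID cannot be compared and would otherwise disappear from the result.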
*** Challenges:
Overall, the Anliker dataset cannot be considered a ‘good primary reference’ for the reasons listed below:
* The CompTox batch search revealed missing CAS-Nos. in the Anliker list (most of them had to be filled in manually)
* Substance/compound names differed from those in PubChem, and most of them were misspelt
* Major issue: the Anliker dataset provides only TP information and no parent information (missing links), so many assumptions had to be made to fill the gaps
