
The Shinyscreen Package

Overview

The Shinyscreen R package is an application intended to give the user a first look at raw mass-spectrometry data. Currently this means that, given a set of data files and a list of masses of known or unknown compounds, the application produces the MS1 and MS2 chromatograms of the substances in the list, as well as their MS2 spectra. None of these features are post-processed in any way. However, a built-in prescreening aid helps the user assess the quality of the spectra.

The application is powered by the MSnbase package and built as a Shiny web application.

Installation

Docker (TODO)

Straight from GitLab (TODO)

Rtools Installation (Windows users reported this as a necessary step)

Use the installr package to download and install Rtools.

install.packages("installr")   
library("installr")
install.Rtools()

This will automatically download Rtools and proceed with the default installation process.

R is not RStudio(tm)

To avoid all the fuss with credentials on GitHub and GitLab, just download the package source and install it with devtools. The following script installs the package and its dependencies.

## ***** INSTALLATION OF THE PACKAGE AND ITS DEPENDENCIES (BEGIN) *****
library(devtools)
SSCREEN_LOC<-"/this/is/where/I/downloaded/shinyscreen/source"
install_deps(SSCREEN_LOC,
             upgrade = "never")
install_local(SSCREEN_LOC,
              upgrade = "never")
## ***** INSTALLATION OF THE PACKAGE AND ITS DEPENDENCIES (END) *****

This installation procedure should work in most situations. If you just want to try out the package without installing it to a system path, create a `.Renviron` file in a project directory with the following content.

R_LIBS_USER=local

Then, create the folder `local` inside the project directory. After this is done, start R from the project directory and make sure that the `local` directory is in the current library path.

.libPaths() # This should list `local'.

If it is, run the installation script above, and you should have shinyscreen and its dependencies inside the `local` directory.
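
The `.Renviron` setup described above can also be scripted from within R. The following is a minimal sketch using only base R; the helper name `setup_local_lib` and the project path are hypothetical and not part of shinyscreen.

```r
## Sketch: prepare a project directory with its own package library.
## `setup_local_lib` is a hypothetical helper, not a shinyscreen function.
setup_local_lib <- function(proj_dir) {
    ## Create the project directory and the `local` library folder inside it.
    dir.create(file.path(proj_dir, "local"),
               recursive = TRUE, showWarnings = FALSE)
    ## Point R_LIBS_USER to `local`; R reads .Renviron at startup.
    writeLines("R_LIBS_USER=local", file.path(proj_dir, ".Renviron"))
    invisible(proj_dir)
}

proj <- file.path(tempdir(), "sscreen-project")  # replace with your own path
setup_local_lib(proj)
```

After running it, start a new R session with the project directory as the working directory and check the output of `.libPaths()` before running the installation script.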

Minor Dependency Hell

There may be a collision between the dependencies you already have on your computer and what devtools wants to install. One way to avoid this is to follow the `.Renviron` approach from above; another is to avoid installing dependencies altogether.

SSCREEN_LOC<-"/this/is/where/I/downloaded/shinyscreen/source"
library(devtools)
## Install only shinyscreen itself; dependencies are left untouched.
devtools::install_local(path=SSCREEN_LOC,force=TRUE,dependencies=FALSE)

Major Dependency Hell

Among the Shinyscreen dependencies, mzR and rcdk have been known to cause major installation issues. To install the dependencies, use the following script:

BiocManager::install(c("curl","rsvg","enviPat","rJava", "fingerprint", "png", "rcdk","mzR","rcdklibs"), dependencies=TRUE)

The major issue in installing the dependencies is loading the rJava library, which is essential for rcdk functionality.

  • Install the latest Java versions: Download
  • On 64-bit systems, ensure that both the 32-bit and the 64-bit versions are installed. On Windows, check in `C:\Program Files\Java` and `C:\Program Files (x86)\Java`.
  • Test if the library is loaded properly and proceed with the installation of RMassBank and RChemMass.

Detailed explanation on how to tackle these problems is available here.
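
To test whether rJava loads properly, something along the following lines can be used. This is only a sketch: `check_rjava` is a hypothetical helper name, and the check assumes that starting a Java virtual machine and querying its version is a sufficient test.

```r
## Sketch: check whether rJava can start a Java virtual machine.
## `check_rjava` is a hypothetical helper, not part of shinyscreen.
check_rjava <- function() {
    if (!requireNamespace("rJava", quietly = TRUE)) {
        message("rJava is not installed or could not be loaded.")
        return(FALSE)
    }
    tryCatch({
        rJava::.jinit()  # start the JVM
        ver <- rJava::.jcall("java/lang/System", "S",
                             "getProperty", "java.version")
        message("rJava is working with Java ", ver)
        TRUE
    }, error = function(e) {
        message("rJava failed to start Java: ", conditionMessage(e))
        FALSE
    })
}

rjava_ok <- check_rjava()
```

If `rjava_ok` is `FALSE`, the message should hint at whether the package or the Java installation is the problem.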

Running Shinyscreen

Provided Shinyscreen is successfully installed, this snippet will run it:

library(shinyscreen)
PROJECT="project/location/somewhere/on/my/storage/device"
launch(projDir=PROJECT) 

The `projDir` argument can be left out, in which case shinyscreen assumes that the project directory is the result of

## Get current working directory of R instance.
getwd()

So, what is the project directory? This is the place where shinyscreen state, log and output files go by default. In other words, if you produce some PDF plots, this is where they are going to end up.

Usage

Before Starting

Compound Lists

The lists of known and unknown compounds contain different information and are treated differently. The application needs at least one, but can take both known and unknown lists as inputs. The formats of both lists are explained below.

Known Compounds List

  • A comma-separated table in a CSV file.
  • The column names are case-sensitive.
  • Required headers:
    ID
    This is an integer compound identifier. This column must be filled and each ID entry must be unique. If both unknown and known lists are given, IDs from both lists must not overlap.
    SMILES
    The SMILES character string. Shinyscreen accepts only MS-Ready SMILES. This column must be filled.

    Name
    The compound name. This column can be left empty.
    RT
    The retention time of the peak in minutes. This column can be left empty.
  • Optional headers:
    mz
    m/z mass of the compound. If both SMILES and mz entries are present for a given compound, mz takes precedence.

"ID","Name","SMILES","RT"
33,"Isoproturon","CC(C)C1=CC=C(NC(=O)N(C)C)C=C1",19.6
717,"epsilon-Decalactone","CCCCC1CCCCC(=O)O1",
67,,"CCCCC1CCCCCC(=O)O1",
...,...,...,...

It is strongly suggested to quote all the character strings, such as SMILES and Name.
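
The rules for the known compounds list can be checked before the list is imported. The following is a minimal sketch in base R; `check_known_list` is a hypothetical helper and not a shinyscreen function, and it only tests the requirements stated above (required headers, filled and unique IDs, filled SMILES).

```r
## Sketch: basic sanity checks for a known compounds list.
## `check_known_list` is a hypothetical helper, not a shinyscreen function.
check_known_list <- function(df) {
    required <- c("ID", "SMILES", "Name", "RT")
    missing_cols <- setdiff(required, names(df))  # names are case-sensitive
    if (length(missing_cols) > 0)
        stop("Missing required columns: ",
             paste(missing_cols, collapse = ", "))
    if (anyNA(df$ID) || anyDuplicated(df$ID) > 0)
        stop("ID must be filled and unique.")
    if (any(is.na(df$SMILES) | df$SMILES == ""))
        stop("SMILES must be filled.")
    invisible(TRUE)
}

## The example list from above (without the placeholder row).
known_csv <- '"ID","Name","SMILES","RT"
33,"Isoproturon","CC(C)C1=CC=C(NC(=O)N(C)C)C=C1",19.6
717,"epsilon-Decalactone","CCCCC1CCCCC(=O)O1",
67,,"CCCCC1CCCCCC(=O)O1",'
known <- read.csv(text = known_csv, stringsAsFactors = FALSE)
check_known_list(known)
```

Empty RT entries are read as `NA`, which matches the rule that RT can be left empty.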

Unknown Compounds List

  • A comma-separated table in a CSV file.
  • Required headers:
    ID
    This is an integer compound identifier. This column must be filled and each ID entry must be unique. If both unknown and known lists are given, IDs from both lists must not overlap.
    mz
    m/z mass of the compound.
    RT
    The retention time of the peak in minutes. This column can be left empty.
"ID","mz","RT"
 22,296.1160,
888,503.2816,
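
A similar check can be sketched for the unknown compounds list, together with the rule that IDs from the known and unknown lists must not overlap. The helper names `check_unknown_list` and `ids_overlap` are hypothetical, and the requirement that every `mz` entry is filled is an assumption made for this sketch (unknowns have no SMILES to compute a mass from).

```r
## Sketch: check an unknown compounds list and its ID overlap with a known list.
## `check_unknown_list` and `ids_overlap` are hypothetical helper names.
check_unknown_list <- function(df) {
    required <- c("ID", "mz", "RT")
    missing_cols <- setdiff(required, names(df))
    if (length(missing_cols) > 0)
        stop("Missing required columns: ",
             paste(missing_cols, collapse = ", "))
    if (anyNA(df$ID) || anyDuplicated(df$ID) > 0)
        stop("ID must be filled and unique.")
    if (anyNA(df$mz))  # assumption: unknowns always need an mz value
        stop("mz must be filled.")
    invisible(TRUE)
}

## Returns the IDs that appear in both lists; should be empty.
ids_overlap <- function(known_ids, unknown_ids) {
    intersect(known_ids, unknown_ids)
}

unknown_csv <- '"ID","mz","RT"
22,296.1160,
888,503.2816,'
unknown <- read.csv(text = unknown_csv)
check_unknown_list(unknown)
overlap <- ids_overlap(c(33, 717, 67), unknown$ID)  # IDs from the known-list example
if (length(overlap) > 0) stop("Known and unknown IDs overlap.")
```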

Compound Sets

Shinyscreen organises its data around the concept of compound sets. If the compounds in a collection of data files can be broken down into logical groups, specifying these groups in a CSV list makes it easier to navigate them in shinyscreen. In this case, the CSV file contains two columns: ID and set. ID is the identifier of the compound from the compound list and set is the name of the set. If there is no sensible way of splitting the compounds into groups, it is enough to copy all the IDs from the compound list into a new CSV file and fill the set column with any character string.

ID set RT
33 mixA
717 mixA
999 mixA
129 mixB
516 mixB
333 mixC
999 mixC
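
When there is no sensible grouping, the single-set list described above can be generated from the compound list. The following is a small sketch in base R; the set name `all` and the output file location are arbitrary choices made for illustration.

```r
## Sketch: put every compound from a compound list into a single set.
compounds <- data.frame(ID = c(33, 717, 67))   # IDs from the known-list example
sets <- data.frame(ID = compounds$ID,
                   set = "all")                # "all" is an arbitrary set name
out_file <- file.path(tempdir(), "sets.csv")   # hypothetical output location
write.csv(sets, out_file, row.names = FALSE)
```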

Data Files

These should be in mzML format.

Sets, Tags, Modes, Files and IDs

Each file is labelled by a tag, mode and set. Sets are defined in the compound set CSV file and group compounds according to their IDs. Modes correspond to the adducts. Tags label files in the plots.

For known compounds, each set can contain multiple modes. Sets of unknowns can only contain a single mode. Files that belong to the same set and have been acquired in the same mode must carry unique tags.

In addition, the IDs of compounds belonging to the same set/mode combination must be unique. Different ID sets may overlap.
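
The tag rule can be illustrated with a small table of file assignments. This is only a sketch: the column names, file names and mode labels below are assumptions made for illustration and do not reflect how shinyscreen stores this information internally.

```r
## Sketch: check that tags are unique within each set/mode combination.
## Column names, file names and mode labels are illustrative assumptions.
files <- data.frame(
    file = c("a.mzML", "b.mzML", "c.mzML", "d.mzML"),
    set  = c("mixA", "mixA", "mixA", "mixB"),
    mode = c("pH", "pH", "pNa", "pH"),
    tag  = c("blank", "sample", "blank", "blank"),
    stringsAsFactors = FALSE)

## A row is a duplicate if its set, mode and tag all repeat an earlier row.
dup <- duplicated(files[, c("set", "mode", "tag")])
if (any(dup)) {
    print(files[dup, ])
    stop("Tags must be unique within each set/mode combination.")
}
```

Here the tag `blank` is reused, but never twice within the same set and mode, so the check passes.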

Config Screen

This is the start tab. Import the compound and set lists first, then proceed to import the mzML files. Provide tags in the tag text box and then assign the sets, modes and tags to the imported mzML files using the table widget. Once this is done, move on to the `Spectra Extraction` tab.

Spectra Extraction

Set the extraction parameters and then select the sets to extract. The extraction may take a while.

After one or more sets have been extracted (that is, once the status box is checked), it is possible to carry out the auto quality check. This check performs a rudimentary analysis of the spectra and retrieves the retention times of the precursor peaks and their MS2 spectra. This procedure must be carried out before the MS2 spectra can be plotted.

TODO: Explain the parameters

For entries with an empty RT, the entire retention time range is scanned for peaks. Entries with a known RT are only scanned within the interval specified by the parameters (by default 1 min), so they are processed much faster than entries without an RT.
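
The effect of a known RT on the scanned interval can be sketched as follows. This is not shinyscreen code: `rt_window` is a hypothetical function, and the sketch assumes that the parameter is a half-width applied symmetrically around the RT, which this README does not state.

```r
## Sketch: retention time interval (in minutes) scanned for one compound.
## Assumption: the window parameter is a half-width around the known RT.
rt_window <- function(rt, full_range, half_width = 1) {
    ## No RT given: the whole retention time range has to be scanned.
    if (is.na(rt)) return(full_range)
    ## Known RT: scan only around it, clipped to the available range.
    c(max(full_range[1], rt - half_width),
      min(full_range[2], rt + half_width))
}

rt_window(19.6, c(0, 30))  # narrow interval around the known RT
rt_window(NA, c(0, 30))    # the full range
```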

Prescreening

The third tab allows visual inspection of the spectra and the chromatograms, as well as export of the plots in PDF format.