Skip to content
Snippets Groups Projects

Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Source

Select target project
No results found

Target

Select target project
  • eci/shinyscreen
  • miroslav.kratochvil/shinyscreen
2 results
Show changes
Commits on Source (67)
Showing with 1220 additions and 1468 deletions
......@@ -44,9 +44,6 @@ variables:
base-image:
tags:
- docker
- $RUNNER_TAG
stage: dep_images
rules:
- if: $CI_COMMIT_TAG == ""
......
Package: shinyscreen
Title: Pre-screening of Mass Spectrometry Data
Version: 1.3.19
Version: 1.3.21
Author: Todor Kondić
Maintainer: Todor Kondić <todor.kondic@uni.lu>
Authors@R:
......@@ -50,6 +50,7 @@ Collate:
'errors.R'
'mix.R'
'envopts.R'
'data-model.R'
'state.R'
'metfrag.R'
'plotting.R'
......
---
title: "shinyscreen Docker Documentation"
author: "Anjana Elapavalore"
date: '`r Sys.Date()`'
output: pdf_document
---
The objective of this documentation is to provide a detailed procedure for running the shinyScreen application within a Docker container. shinyScreen is a web-based tool developed for visualizing and analyzing high-throughput screening data using Shiny, a web application framework for R. Docker is employed to containerize the application, ensuring reproducibility and simplifying the deployment process by packaging the application with all its dependencies.
This document provides step-by-step instructions on how to install Docker, pull the ShinyScreen Docker image, and run the application.
## Prerequisites
Before proceeding, the following prerequisites must be met:
- Docker Desktop must be installed and properly configured on the system.
- Sufficient system resources, including at least 4 GB of RAM, are recommended for optimal performance.
## Step 1: Install Docker Desktop on Windows
If Docker Desktop is not already installed on your system, follow these steps to set it up:
1. **Download Docker Desktop**:
- Visit the [Docker Desktop for Windows](https://docs.docker.com/desktop/install/windows-install/) download page and download the installer.
2. **Install Docker Desktop**:
- Run the downloaded installer.
- Follow the installation prompts, and make sure the option to "Use the WSL 2-based engine" is selected during setup (if available).
3. **Verify Docker Installation**:
- After installation, launch Docker Desktop.
- Ensure Docker is running by checking the Docker icon in the system tray (it should be active).
- Open a command prompt (CMD) or PowerShell and type the following command to check the Docker version:
```bash
docker --version
```
This should return the installed Docker version.
## Step 2: Search for the ShinyScreen Docker Image in Docker Desktop
Docker Desktop provides a graphical interface to search for Docker images. To search and pull the **shinyScreen** Docker image:
1. **Open Docker Desktop**:
- Launch Docker Desktop from the Windows Start menu or by clicking the Docker icon in the system tray.
2. **Search for the ShinyScreen Image**:
- In Docker Desktop, navigate to the **search** tab as shown in the image below.
**Docker Image : anjanae/shinyscreen:v21_latest**
```{r, echo=FALSE, out.width="70%"}
knitr::include_graphics("images/searchtab.png")
```
3. **Pull the Image**:
- Click on the **Pull** button next to the desired ShinyScreen image. This will download the image to your local system.
You can monitor the download progress in Docker Desktop’s **Images** tab.
## Step 3: Run the ShinyScreen Docker Container
Once the ShinyScreen image has been pulled successfully, the next step is to run the container. This can be done directly from Docker Desktop or via the command line:
### Running the Container via Docker Desktop:
1. **Navigate to the Images Tab**:
- In Docker Desktop, go to the **Images** tab where your downloaded ShinyScreen image will be listed.
2. **Launch the Container**:
- Click the **Run** button next to the ShinyScreen image.
- In the pop-up dialog, ensure that port `3838` is exposed. To do this, add the following port configuration in the **Optional Settings**:
```
Host Port: 3838
Container Port: 3838
```
3. **To create the volume mounts**:
- To share files between the host and the container, you can use Docker’s volume mounting feature.
### local Directory Structure
Before launching the Docker container, the following directory structure should be set up on your local system. These directories will be mounted to specific locations inside the container, enabling the application to access the required input data and store the output files.
### Root Directory
The **root directory** is the top-level folder where the entire project is organized. This root directory will contain the following subdirectories:
1. **Project Directory**
2. **Data Directory**
3. **MetFrag Databases Directory**
Below is an overview of each subdirectory and its purpose.
### Project Directory
The **Project Directory** should initially contain only the **compound list** file, which is an essential input for the ShinyScreen application.
After the application runs, this directory will also store various output files generated during the analysis.
### Data Directory
The **Data Directory** should contain only the **mzML files**. These files serve as the primary data input for the ShinyScreen application and represent mass spectrometry data in mzML format.
### MetFrag Databases Directory
The **MetFrag Databases Directory** must include the latest **PubChemLite_Exposomics.csv** file, which will be used by the ShinyScreen application for MetFrag analysis.
### Configure Volume Mounts in Docker Desktop
Now that the local directories are set up, you can configure **volume mounts** in Docker Desktop to map these local directories to the corresponding paths inside the Docker container.
For the volume mounts, each directory on the host machine (local system) will be mapped to a specific location within the Docker container, allowing the application to access the input data and save output files.
In the **Optional Settings** pop-up window, you will configure the following **three volume mounts** (Please refer to the table and image below):
- **Local Directory on Windows**:
- This is the folder on your local machine where the data will reside.
- **Container Directory**:
- This is the path inside the Docker container where the files will be mounted.
| Local Directory (Host) | Container Directory (Mount) | Description |
| ------------------------------ | -------------------------------------- | ------------------------------- |
| `C:\path\to\rootdirectory` | `/home/ssuser/projects` | Directory for project files. |
| `C:\path\to\rootdirectory` | `/home/ssuser/top_data_dir` | Directory for top-level data. |
| `C:\path\to\metfrag_dbs` | `/home/ssuser/metfrag_dbs` | Directory for MetFrag databases. |
```{r, echo=FALSE, out.width="70%"}
knitr::include_graphics("images/runtab.png")
```
By setting up the volume mounts, any changes made to files in the mapped directories on the host machine will automatically reflect inside the container.
Once the **port** and **volume configurations** are confirmed, click **Run** to start the container. The ShinyScreen application will now be running inside the container.
Now click on the port mapping to access the application in the web browser
```{r, echo=FALSE, out.width="85%"}
knitr::include_graphics("images/containertab.png")
```
File added
Docker_Documentation/images/containertab.png

198 KiB

Docker_Documentation/images/runtab.png

97.9 KiB

Docker_Documentation/images/searchtab.png

141 KiB

FROM gitlab.lcsb.uni.lu:4567/eci/shinyscreen/dep/ssuser:latest
MAINTAINER todor.kondic@uni.lu
EXPOSE 5432
EXPOSE 3838
ENV SS_MF_DB="PubChemLite_exposomics.csv"
ENV SS_CPU 2
ADD . shinyscreen/
RUN R CMD build shinyscreen
USER root
RUN R CMD INSTALL shinyscreen
USER ssuser
RUN cp shinyscreen/runme /home/ssuser/runme
RUN chmod u+x /home/ssuser/runme
# RUN chown ssuser /home/ssuser/runme
# RUN chown -R ssuser /home/ssuser/shinyscreen
RUN R -e 'library(shinyscreen);setwd("~");init(top_data_dir="~/top_data_dir",projects="~/projects",users_dir="~/users",metfrag_db_dir=Sys.getenv("SS_MF_DB_DIR"),metfrag_jar="/usr/local/bin/MetFragCommandLine.jar",no_structure_plots=T,save=T,merge=F)'
ENTRYPOINT ["/home/ssuser/runme"]
CMD ["app"]
......@@ -6,7 +6,6 @@ export(conf_trans)
export(create_plots)
export(create_stub_gui)
export(extr_data)
export(extract)
export(get_fn_comp)
export(get_fn_conf)
export(get_fn_extr)
......@@ -47,8 +46,6 @@ export(sort_spectra)
export(subset_summary)
export(tk_save_file)
import(data.table)
importFrom(MSnbase,filterMz)
importFrom(MSnbase,readMSData)
importFrom(promises,"%...>%")
importFrom(promises,future_promise)
importFrom(shiny,HTML)
......
......@@ -32,6 +32,7 @@ run <- function(envopts,
all_phases=list(setup=setup_phase,
comptab=mk_comp_tab,
db=make_db,
extract=extr_data,
prescreen=prescreen,
sort=sort_spectra,
......@@ -98,39 +99,14 @@ run_in_dir <- function(m) {
}
##' @export
load_compound_input <- function(m) {
coll <- list()
fields <- colnames(EMPTY_CMPD_LIST)
fns <- file.path(m$run$paths$project,m$conf$compounds$lists)
message("fns:",paste0(fns,collapse=","))
coltypes <- c(ID="character",
SMILES="character",
Formula="character",
Name="character",
RT="numeric",
mz="numeric")
for (l in 1:length(fns)) {
fn <- fns[[l]]
## Figure out column headers.
nms <- colnames(file2tab(fn,nrows=0))
## Read the table. Knowing column headers prevents unnecessary
## warnings.
dt <- file2tab(fn, colClasses=coltypes[nms])
verify_cmpd_l(dt=dt,fn=fn)
# nonexist <- setdiff(fnfields,fields)
coll[[l]] <- dt #if (length(nonexist)==0) dt else dt[,(nonexist) := NULL]
coll[[l]]$ORIG <- fn
}
cmpds <- if (length(fns)>0) rbindlist(l=c(list(EMPTY_CMPD_LIST), coll), use.names = T, fill = T) else EMPTY_CMPD_LIST
fns <- file.path(m$run$paths$project,m$conf$compounds$lists)
cmpds = join_compound_lists(fns)
## Process sets.
cmpds = process_cmpd_sets(cmpds)
dups <- duplicated(cmpds$ID)
dups <- dups | duplicated(cmpds$ID,fromLast = T)
dupIDs <- cmpds$ID[dups]
......@@ -219,8 +195,6 @@ mk_comp_tab <- function(m) {
smiforadd <- smiles[smiforadd,.(ID,SMILES,Formula,adduct),on=c("SMILES")]
data.table::setkey(smiforadd,"adduct","ID")
## FIXME: Why is Formula a list when there are no SMILES, instead
## of an empty string?
smiforadd[,Formula:=as.character(Formula)]
## Update the intermediate table with masses.
......@@ -336,181 +310,67 @@ mk_tol_funcs <- function(m) {
}
##' @export
extr_data <-function(m) {
message("Stage: extract")
if (is.null(m$conf$serial) || !m$conf$serial) {
extr_data_future(m)
} else {
message("(extract): Serial extraction.")
extr_data_serial(m)
}
}
extr_data_future <- function(m) {
## Reduce the comp table to only unique masses (this is because
## different sets can have same masses).
m$out$tab$data <- m$out$tab$comp[,head(.SD,1),by=BASE_KEY]
m$out$tab$data[,set:=NULL] #This column is meaningless now.
file <- m$out$tab$data[,unique(file)]
fpaths <- file.path(m$run$paths$data,file)
allCEs <- do.call(c,args=lapply(fpaths,function(fn) {
z <- MSnbase::readMSData(files=fn,msLevel = c(1,2),mode="onDisk")
fine = create_fine_table(m)
unique(MSnbase::collisionEnergy(z),fromLast=T)
}))
allCEs <- unique(allCEs)
allCEs <- allCEs[!is.na(allCEs)]
cols <-paste('CE',allCEs,sep = '')
vals <- rep(NA,length(cols))
m$out$tab$data[,(cols) := .(rep(NA,.N))]
file <- m$out$tab$data[,unique(file)]
ftags <- m$out$tab$data[,.(tag=unique(tag)),by=file]
fpaths <- file.path(m$run$paths$data,ftags[,file])
futuref <- m$future
tmp <- lapply(1:nrow(ftags),function(ii) {
fn <- fpaths[[ii]]
the_tag <- ftags[ii,tag]
message("(extract): Commencing extraction for tag: ", the_tag, "; file: ",fn)
tab <- as.data.frame(data.table::copy(m$out$tab$data[tag==the_tag,.(file,tag,adduct,mz,rt,ID)]))
## err_ms1_eic <- m$extr$tol$eic
## err_coarse_fun <- m$extr$tol$coarse
## err_fine_fun <- m$extr$tol$fine
## err_rt <- m$extr$tol$rt
err_coarse <- m$conf$tolerance[["ms1 coarse"]]
err_fine <- m$conf$tolerance[["ms1 fine"]]
dpath = m$run$paths$data
err_ms1_eic <- m$conf$tolerance$eic
err_rt <- m$conf$tolerance$rt
## Open all files.
fns = fine[,unique(file)]
lms = lapply(fns,function(fn) read_data_file(file.path(dpath,fn)))
names(lms) = fns
## Load all feature data.
lfdata = lapply(lms,get_fdata)
names(lfdata) = fns
missing_precursor_info <- m$conf$extract$missing_precursor_info
x <- futuref(extract(fn=fn,
tag=the_tag,
tab=tab,
err_ms1_eic=err_ms1_eic,
err_coarse = err_coarse,
err_fine= err_fine,
err_rt= err_rt,
missing_precursors = missing_precursor_info),
lazy = F)
## Extract MS1 chromatograms using "fine" tolerance.
cgram_ms1 = fine[,extr_cgrams_ms1(lms[[file]],.SD,lfdata[[file]]),
by="file",
.SDcols=c("iso_fine_min",
"iso_fine_max",
"rt_min",
"rt_max",
"precid")]
setkey(cgram_ms1,file,precid)
x
## Extract MS2 chromatograms.
})
## Create the "coarse" table. Parent masses are known with
## "coarse". We will prefilter our ms2 results based on that...
coarse = create_coarse_table(m)
msk <- sapply(tmp,future::resolved)
curr_done <- which(msk)
for (x in curr_done) {
message("Done extraction for ", future::value(tmp[[x]])$ms1$tag[[1]])
}
while (!all(msk)) {
msk <- sapply(tmp,future::resolved)
newly_done <- which(msk)
for (x in setdiff(newly_done,curr_done)) {
message("Done extraction for ", future::value(tmp[[x]])$ms1$tag[[1]])
}
Sys.sleep(0.5)
curr_done <- newly_done
cgram_ms2 = data.table(precid=integer(0),
ce=numeric(0),
scan=character(0),
idx=integer(0),
rt=numeric(0),
intensity=numeric(0))
## Extract MS2 spectra.
spectra = empty_spectra_table()
for (fn in names(lfdata)) {
rtab = relate_ms2_to_precid(coarse=coarse[.(fn),on=.(file)],
ms2=lfdata[[fn]]$ms2,
cgram_ms1=cgram_ms1[.(fn),
on=.(file)])
sptab = extract_spectra(lms[[fn]],rtab)
cgram_ms2 = rbind(cgram_ms2,rtab)
spectra = rbind(spectra,sptab)
}
ztmp <- lapply(tmp,future::value)
m$extr$ms1 <- data.table::rbindlist(lapply(ztmp,function(x) x$ms1))
m$extr$ms2 <- data.table::rbindlist(lapply(ztmp,function(x) x$ms2))
data.table::setkeyv(m$extr$ms1,BASE_KEY)
data.table::setkeyv(m$extr$ms2,c(BASE_KEY,"CE"))
fn_ex <- get_fn_extr(m)
timetag <- format(Sys.time(), "%Y%m%d_%H%M%S")
saveRDS(object = m, file = file.path(m$run$paths$project,FN_EXTR_STATE))
setkey(cgram_ms1,precid,rt)
setkey(cgram_ms2,precid,ce,rt)
setkey(spectra,precid,scan)
m$db$extr$cgm$ms1 = cgram_ms1
m$db$extr$cgm$ms2 = cgram_ms2
m$db$extr$spectra = spectra
m
}
extr_data_serial <- function(m) {
## Reduce the comp table to only unique masses (this is because
## different sets can have same masses).
m$out$tab$data <- m$out$tab$comp[,head(.SD,1),by=BASE_KEY]
m$out$tab$data[,set:=NULL] #This column is meaningless now.
file <- m$out$tab$data[,unique(file)]
fpaths <- file.path(m$run$paths$data,file)
allCEs <- do.call(c,args=lapply(fpaths,function(fn) {
z <- MSnbase::readMSData(files=fn,msLevel = c(1,2),mode="onDisk")
unique(MSnbase::collisionEnergy(z),fromLast=T)
}))
allCEs <- unique(allCEs)
allCEs <- allCEs[!is.na(allCEs)]
cols <-paste('CE',allCEs,sep = '')
vals <- rep(NA,length(cols))
m$out$tab$data[,(cols) := .(rep(NA,.N))]
file <- file.path(m$run$paths$data,m$out$tab$data[,unique(file)])
ftags <- m$out$tab$data[,.(tag=unique(tag)),by=file]
ftags[,path:=file.path(..m$run$paths$data,file)]
futuref <- m$future
tmp <- lapply(1:nrow(ftags),function(ii) {
fn <- ftags[ii,path]
the_tag <- ftags[ii,tag]
message("(extract): Commencing extraction for tag: ", the_tag, "; file: ",fn)
tab <- as.data.frame(data.table::copy(m$out$tab$data[tag==the_tag,.(file,tag,adduct,mz,rt,ID)]))
## err_ms1_eic <- m$extr$tol$eic
## err_coarse_fun <- m$extr$tol$coarse
## err_fine_fun <- m$extr$tol$fine
## err_rt <- m$extr$tol$rt
err_coarse <- m$conf$tolerance[["ms1 coarse"]]
err_fine <- m$conf$tolerance[["ms1 fine"]]
err_ms1_eic <- m$conf$tolerance$eic
err_rt <- m$conf$tolerance$rt
missing_precursor_info <- m$conf$extract$missing_precursor_info
x <- extract(fn=fn,
tag=the_tag,
tab=tab,
err_ms1_eic=err_ms1_eic,
err_coarse = err_coarse,
err_fine= err_fine,
err_rt= err_rt,
missing_precursors = missing_precursor_info)
message("Done extraction for ", x$ms1$tag[[1]])
x
})
ztmp <- tmp
m$extr$ms1 <- data.table::rbindlist(lapply(ztmp,function(x) x$ms1))
m$extr$ms2 <- data.table::rbindlist(lapply(ztmp,function(x) x$ms2))
data.table::setkeyv(m$extr$ms1,BASE_KEY)
data.table::setkeyv(m$extr$ms2,c(BASE_KEY,"CE"))
fn_ex <- get_fn_extr(m)
timetag <- format(Sys.time(), "%Y%m%d_%H%M%S")
saveRDS(object = m, file = file.path(m$run$paths$project,FN_EXTR_STATE))
m
}
##' @export
......@@ -523,11 +383,10 @@ conf_trans <- function(conf) {
prescreen <- function(m) {
## Top-level auto prescreening function.
message("(prescreen): Start.")
## confpres <- conf_trans_pres(m$conf$prescreen)
m$qa <- NULL
m$out$tab$summ <- NULL
m$qa <- analyse_extracted_data(m$extr,m$conf$prescreen)
m$out$tab$summ <- gen_summ(m$out$tab$comp,m$qa)
m$qa = NULL
m$out$tab$summ = NULL
m$qa = analyse_extracted_data(m$db,m$conf$prescreen)
m$out$tab$summ = gen_summ(m$db,m$qa,m$out$tab$comp)
message("(prescreen): End.")
m
}
......@@ -605,8 +464,8 @@ create_plots <- function(m) {
## Select the data nedeed for plotting.
flt_summ <- m$out$tab$flt_summ
ms2 <- m$extr$ms2
ms1 <- m$extr$ms1
ms2 <- m$db$extr$cgm$ms2
ms1 <- m$db$extr$cgm$ms1
rt_min <- rt_in_min(m$conf$figures$rt_min)
rt_max <- rt_in_min(m$conf$figures$rt_max)
keytab <- flt_summ[,unique(.SD),.SDcol=c("adduct","ID")]
......@@ -747,8 +606,8 @@ report <- function(m) {
dir.create(REP_TOPDIR,recursive = T,showWarnings = F)
header <- knitr::knit_expand(fn_header)
flt_summ <- m$out$tab$reptab
ms2 <- m$extr$ms2
ms1 <- m$extr$ms1
ms2 <- m$db$extr$cgm$ms2
ms1 <- m$db$extr$cgm$ms1
rt_min <- rt_in_min(m$conf$figures$rt_min)
rt_max <- rt_in_min(m$conf$figures$rt_max)
keytab <- flt_summ[,unique(.SD),.SDcol=c("adduct","ID")]
......@@ -922,7 +781,7 @@ metfrag <- function(m) {
path = m$run$metfrag$path,
subpaths = m$run$metfrag$subpaths,
db_file = m$run$metfrag$db_file,
stag_tab = stagtab, ms2 = m$extr$ms2,
stag_tab = stagtab, ms2 = m$db$extr$spectra,
runtime=m$run$metfrag$runtime,
java_bin=m$run$metfrag$java_bin,
nproc = m$conf$metfrag$nproc)
......@@ -942,3 +801,9 @@ metfrag <- function(m) {
m
}
make_db <- function(m) {
m = make_db_catalogue(m)
m = make_db_precursors(m)
m
}
......@@ -18,7 +18,13 @@
the_ifelse <- data.table::fifelse
dtable <- data.table::data.table
norm_path <- function(...) normalizePath(...,winslash='/')
norm_path <- function(...) {
test = nchar(...) > 0L
res = character(length(test))
res[test] = normalizePath(...[test],winslash='/')
res[!test] = ...[!test]
res
}
tab2file<-function(tab,file,...) {
data.table::fwrite(x=tab,file=file,...)
......@@ -143,3 +149,23 @@ uniqy_slugs <- function(slugs) {
dt = data.table::data.table(slug=slugs)
dt[,slug:=fifelse(rep(.N==1L,.N),slug,paste0(slug,"_",seq(1L,.N))),by="slug"]$slug
}
gen_val_unc <- function(x,dx) {
## Doesn't work well for <=0.
p = floor(log10(x))
dp = floor(log10(dx))
## Zero?
message("p ",p)
w = which(is.infinite(p))
p[w] = 0
## Normalise x and dx.
main = x/10**p
unc = round(dx/10**dp,0)
place = p - dp
main = mapply(function (m,d) formatC(m,digits=d,format='f',flag="#"),main,place,USE.NAMES=F)
w = which(main=='10.')
main[w]='1'
p[w]=p[w]+1
paste0(main,"(",unc,") x 10^",p)
}
#Copyright (C) 2023 by University of Luxembourg
## Shinyscreen works of an internal relational database implemented
## using `data.table' package. Implementation is here.
make_db_catalogue <- function(m) {
## Takes comprehensive database from state `m' and generates a
## catalogue with a unique key. This catalogue is based on
## inputs. Each entry in the catalogue corresponds to a single
## target mass from a single experimental run.
res = m$out$tab$comp[,unique(.SD),.SDcols=c("set","tag","adduct","ID")]
res[,catid:=.I]
setkeyv(res,DB_CATALOGUE_KEY)
setindex(res,catid)
m$db$cat = res
m
}
merge_precid_4_isobars <- function(orig_precids,masses,up_masses) {
start = head(orig_precids,1L)
n = length(orig_precids)
precid = orig_precids
i = 1L
while (i < n) {
theprecid = orig_precids[[i]]
themz = masses[[i]]
mzup = up_masses[[i]]
w = which(masses[(i+1L):n]<mzup)
precid[(i+1L):n][w] = theprecid
i = i + length(w) + 1L
}
precid
}
make_db_precursors <- function(m) {
## Generate masses and label isobars.
cat = m$db$cat
masses = m$out$tab$comp[cat,.(tag=tag,catid=catid,mz=mz,rt=rt),on=key(cat)]
setkey(masses,tag,mz)
## Retention time.
tmp = get_val_unit(m$conf$tolerance[['rt']])
rttol = as.numeric(tmp[['val']])
rtunit = tmp[['unit']]
if (rtunit == "s") {
rttol = rttol/60.
} else if (rtunit != "min") {
stop('make_db_precursors: Unknown retention time unit.')
}
masses[!is.na(rt),`:=`(rt_min=rt-rttol,rt_max=rt+rttol)]
## Fine error.
tmp = get_val_unit(m$conf$tolerance[['ms1 fine']])
ms1tol = as.numeric(tmp[['val']])
ms1unit = tmp[['unit']]
if (ms1unit == "ppm") {
masses[,`:=`(mz_fine_min=mz-ms1tol*mz*1e-6,mz_fine_max=mz+ms1tol*mz*1e-6)]
} else if (ms1unit == "Da") {
masses[,`:=`(mz_fine_min=mz-ms1tol,mz_fine_max=mz+ms1tol)]
} else {
stop('make_db_precursors: Unknown mass unit (fine).')
}
## Coarse error.
tmp = get_val_unit(m$conf$tolerance[['ms1 coarse']])
ms1tol = as.numeric(tmp[['val']])
ms1unit = tmp[['unit']]
if (ms1unit == "ppm") {
masses[,`:=`(mz_coarse_min=mz-ms1tol*mz*1e-6,mz_coarse_max=mz+ms1tol*mz*1e-6)]
} else if (ms1unit == "Da") {
masses[,`:=`(mz_coarse_min=mz-ms1tol,mz_coarse_max=mz+ms1tol)]
} else {
stop('make_db_precursors: Unknown mass unit (coarse).')
}
## Assign "fine" isobars to same isocoarse number.
masses[,precid:=merge_precid_4_isobars(catid,mz,mz_fine_max),by="tag"]
## Assign "coarse" isobars to same isocoarse number.
masses[,isocoarse:=merge_precid_4_isobars(catid,mz,mz_coarse_max),by="tag"]
masses[,`:=`(iso_coarse_min=min(mz_coarse_min),
iso_coarse_max=max(mz_coarse_max)),
by=isocoarse]
masses[,`:=`(iso_fine_min=min(mz_fine_min),
iso_fine_max=max(mz_fine_max)),
by=precid]
setindex(masses,isocoarse,precid)
## Add files.
filetab = m$input$tab$mzml[m$db$cat,
.(catid=i.catid,file=file),
on=c("set","tag"),nomatch=NULL]
masses[filetab,file:=i.file,on="catid"]
m$db$precursors = masses
m
}
empty_cgram_ms1 <- function(n=0L) {
r = data.table(file=character(n),
cgmidx=integer(n),
precid=integer(n),
scan=integer(n),
rt=numeric(n),
intensity=numeric(n))
setkey(r,precid,rt)
r
}
empty_cgram_ms2 <- function(n=0L) {
r = data.table(precid=integer(n),
ce=numeric(n),
scan=character(n),
idx=integer(n),
rt=numeric(n),
intensity=numeric(n))
setkey(r,precid,ce,idx)
r
}
empty_spectra_table <- function(n=0L) {
r = data.table(precid=integer(n),
scan=character(n),
mz=numeric(n),
intensity=numeric(n))
setkey(r,precid,scan)
r
}
summ_needs_from_cat <- function(cat) {
## Catalogue columns.
cat
}
summ_needs_from_precursors <- function(res,precursors) {
## Mass columns.
precursors[res,on=.(catid),.(precid,
mz,
set,
adduct,
tag,
ID,
mz_l=mz_fine_min,
mz_r=mz_fine_max),by=.EACHI]
}
summ_needs_from_qa <- function(res,qa) {
needs = qa[,.SD,.SDcols=c("precid",
"ce",
"scan",
"ms1_rt",
"ms1_int",
"ms2_rt",
"ms2_int",
"ms1_mean",
"ms2_sel",
"qa_pass",
"qa_ms1_exists",
"qa_ms2_exists",
"qa_ms1_good_int",
"qa_ms1_above_noise",
"qa_ms2_near",
"qa_ms2_good_int",
"qlt_ms1",
"qlt_ms2")]
res = needs[res,on=.(precid),allow.cartesian=T]
## TODO: additional processing?
res
}
summ_needs_from_comp <- function(res,comp) {
needs = comp[,.(set,ID,Name,SMILES)]
setkey(needs,set,ID)
res[needs,on=.(set,ID),`:=`(Name=i.Name,
SMILES=i.SMILES)]
}
## This function creates `summ' table.
gen_summ <- function(db,qa,comp) {
## Start with the basic things.
res = summ_needs_from_cat(db$cat)
## Add masses and precids.
res = summ_needs_from_precursors(res,db$precursors)
## Add qa columns.
res = summ_needs_from_qa(res,qa)
setkeyv(res,SUMM_KEY)
## Add comp columns.
summ_needs_from_comp(res,comp)
}
This diff is collapsed.
......@@ -36,7 +36,7 @@ metfrag_get_stag_tab <- function(summ) {
## Argument summ can be a subset of actual `summ' table.
x = gen_1d_keytab(summ)
data.table::setnames(x,old="key1d",new="stag")
res = x[summ,`:=`(CE=i.CE,ion_mz=mz)]
res = x[summ,`:=`(ce=i.ce,ion_mz=mz)]
res
}
......@@ -54,18 +54,21 @@ get_mf_res_ext <- function(fn) {
metfrag_run <- function(param,path,subpaths,db_file,stag_tab,ms2,runtime,java_bin,nproc = 1L) {
keys = intersect(colnames(stag_tab),colnames(ms2))
rms2 = ms2[stag_tab,on=keys,nomatch=NULL]
message("Generating MetFrag configs.")
file_tab = ms2[stag_tab,{
file_tab = rms2[,{
r = write_metfrag_config(param = ..param,
path = ..path,
subpaths = ..subpaths,
db_file = ..db_file,
stag = stag,
adduct = adduct,
ion_mz = ion_mz,
stag = first(stag),
adduct = first(adduct),
ion_mz = first(ion_mz),
spec = data.table(mz=mz,intensity=intensity))
c(r,stag = stag)
},by=.EACHI,on=keys]
c(r,stag = first(stag))
},keyby=keys]
message("Done generating MetFrag configs.")
withr::with_dir(path,{
......@@ -89,33 +92,30 @@ metfrag_run <- function(param,path,subpaths,db_file,stag_tab,ms2,runtime,java_bi
mf_narrow_summ <- function(summ,kv,ms2_rt_i=NA_integer_,ms2_rt_f=NA_integer_) {
skey = data.table::key(summ)
cols = c("adduct","tag","ID","CE","an","mz","qa_pass","ms2_rt")
nsumm = get_rows_from_summ(summ,kv,cols)
cols = union(names(skey),c("adduct","tag","ID","ce","precid","scan","mz","qa_pass","ms2_rt"))
dtkv = as.data.table(kv)
nsumm = summ[dtkv,on=names(kv),.SD,.SDcols=cols]
nsumm = nsumm[qa_pass==T] # Those that make sense.
nsumm_key = union(SUMM_KEY,"ms2_rt")
nsumm_key = intersect(union(SUMM_KEY,"ms2_rt"),colnames(nsumm))
data.table::setkeyv(nsumm,nsumm_key)
if (!is.na(ms2_rt_i)) {
nsumm = nsumm[ms2_rt>(ms2_rt_i)]
}
ms2_rt_i = if (!is.na(ms2_rt_i)) ms2_rt_i else 0.
ms2_rt_f = if (!is.na(ms2_rt_f)) ms2_rt_f else Inf
if (!is.na(ms2_rt_f)) {
nsumm = nsumm[ms2_rt<(ms2_rt_f)]
}
nsumm[ms2_rt > (ms2_rt_i) & ms2_rt < (ms2_rt_f)]
nsumm
}
get_metfrag_targets <- function(stag_tab,ms2) {
## Take the columns we need from summ.
x = summ[ms2_sel==T,.SD,.SDcols=c(key(summ),"mz")]
mrg_keys = c(intersect(key(ms2),key(summ)),"an")
mrg_keys = c(intersect(key(ms2),key(summ)),"scan")
x=ms2[x,.(CE=CE,ion_mz=i.mz,mz,intensity),on=mrg_keys,by=.EACHI]
## Get column order so that `an' follows `CE'.
resnms = setdiff(mrg_keys,"an")
nms = union(union(resnms,"CE"),c("an","ion_mz","mz","intensity"))
resnms = setdiff(mrg_keys,"scan")
nms = union(union(resnms,"CE"),c("scan","ion_mz","mz","intensity"))
data.table::setcolorder(x,neworder = nms)
setkeyv(x,unique(c(resnms,"CE","an")))
setkeyv(x,unique(c(resnms,"CE","scan")))
x
}
......@@ -220,7 +220,7 @@ summarise_metfrag_results <- function(param,path,subpaths,cand_parameters,db_sco
}
.adapt_col_types <- function(x) {
x[,(names(db_scores)):=lapply(.SD, as.numeric),.SDcol=names(db_scores)]
if (length(db_scores)>0) x[,(names(db_scores)):=lapply(.SD, as.numeric),.SDcol=names(db_scores)] else x
}
.calc_basic_scores <- function(x) {
......
......@@ -191,7 +191,7 @@ gen_empty_summ <- function() {
## ms2_cols <- intersect(colnames(qa_ms2),SUMM_COLS)
## ms2_cols <- setdiff(ms2_cols,colnames(summ))
## summ <- qa_ms2[summ,c(..comp_cols,..ms1_cols,..ms2_cols),on=BASE_KEY]
## data.table::setkeyv(summ,c(BASE_KEY_MS2,"an"))
## data.table::setkeyv(summ,c(BASE_KEY_MS2,"scan"))
## summ[,qa_ms1_exists:=the_ifelse(!is.na(qa_ms1_good_int),T,F)]
## summ[,qa_ms2_exists:=the_ifelse(!is.na(CE),T,F)]
## summ[,qa_pass:=apply(.SD,1,all),.SDcols=QA_FLAGS[!(QA_FLAGS %in% "qa_pass")]]
......@@ -547,274 +547,6 @@ conf_trans_pres <- function(pres_list) {
pres_list
}
## PRESCREENING
create_qa_table <- function(extr,conf_presc) {
## The first input argument is the extracted `ms`, table
## containing MS1 and MS2 spectra. The argument `conf_presc` is
## m$conf$prescreen, the prescreening parameters given in the conf
## file.
## The qa table is just a copy of ms with added quality control
## columns QA_COLS.
## The QA_FLAGS columns are flags specifying which properties of
## compounds are known well, or not.
## For each compound (mass) we ask the following questions:
## qa_ms1_exists -- does the MS1 spectrum exist at all?
## qa_ms2_exists -- do we have any MS2 spectra at all?
## qa_ms1_above_noise -- is MS1 above the noise treshold?
## qa_ms2_near -- is there any MS2 spectrum inside the tolerated
## retention time window around the MS1 peak? That is, are we
## non-RT-shifted?
## qa_ms2_good_int -- Is there any MS2 spectral intensity greater
## than the MS2 threshold and less than the MS1 peak?
## qa_pass -- did the spectrum pass all the checks?
## The columns in QA_NUM_REAL are:
##
## ms1_int -- the maximum intensity of MS1 spectrum over the
## entire run;
##
## ms1_rt -- the retention time of the peak MS1.
## The columns in QA_NUM_INT are:
##
## ms2_sel -- index of the selected MS2 spectrum; if not NA, the
## associated spectrum passed all the checks (qa_pass == T); the
## spectrum itself is in one of the member sublists of the `spec'
## column. The integer `ms2_sel' is then the index of the spectrum
## in that sublist.
##
## ms1_rt_ind -- TODO (but not important to end users).
qa <- list(prescreen=conf_presc)
## checks <- extr$ms2[,{
## },keyby=BASE_KEY_MS2]
checks <- extr$ms2
checks[,(QA_FLAGS):=F]
## message("checks:")
## print(checks)
## message("done checks")
checks[,(QA_NUM_INT):=NA_integer_]
checks[,(QA_NUM_REAL):=NA_real_]
setkeyv(checks,BASE_KEY_MS2)
qa$checks <- checks
qa
}
assess_ms1 <- function(m) {
qa <- m$qa
ms1 <- m$extr$ms1
## Calculate auxiliary variables and indices.
qa_ms1 <- ms1[,.(ms1_rt_ind=which.max(intensity)),keyby=BASE_KEY]
qa_ms1 <- ms1[qa_ms1,.(ms1_rt_ind=ms1_rt_ind,
ms1_int=intensity[[ms1_rt_ind]],
ms1_rt=rt[[ms1_rt_ind]],
ms1_mean=mean(intensity)),on=BASE_KEY,by=.EACHI]
qa_ms1[,qa_ms1_good_int := ms1_int > qa$prescreen$ms1_int_thresh]
qa_ms1[,qa_ms1_above_noise := F]
qa_ms1[qa_ms1_good_int==T,qa_ms1_above_noise := .(ms1_int > qa$prescreen$s2n*ms1_mean)]
## checks[(!qa_ms1_above_noise),c("qa_ms2_good_int","qa_ms2_near","qa_ms2_exists","qa_pass"):=F]
## qa_ms1 <- check_ms1_noise(check_ms1(qa_ms1))
m$qa$ms1 <- qa_ms1
m
}
assess_ms2 <- function(m) {
presconf <- conf_trans_pres(m$conf$prescreen)
ms1 <- m$extr$ms1
ms2 <- m$extr$ms2
qa_ms1 <- m$qa$ms1
qa_ms2 <- ms2[qa_ms1[qa_ms1_above_noise==T],.(CE=unique(CE),
pc_rt=i.ms1_rt,
pc_int=i.ms1_int,
an=unique(an)),on=BASE_KEY,by=.EACHI,nomatch=NULL]
rt_win2 <- presconf$ret_time_shift_tol
qa_ms2 <- ms2[qa_ms2,.(pc_rt=pc_rt,
pc_int=pc_int,
ms2_int=max(intensity),
ms2_rt=unique(rt),
qa_ms2_near=head(rt,1) < pc_rt + rt_win2 & head(rt,1) > pc_rt - rt_win2),
by=.EACHI,on=c(BASE_KEY_MS2,"an")]
qa_ms2$qa_ms2_good_int <-F
qa_ms2[qa_ms2_near==T,
qa_ms2_good_int := ms2_int > presconf$ms2_int_thresh & ms2_int < pc_int,
by=c(BASE_KEY_MS2,"an")]
## qa_ms2$qa_pass <- F
## qa_ms2[qa_ms2_good_int==T,qa_pass:=T]
qa_ms2$ms2_sel <- F
qa_ms2[qa_ms2_good_int==T,ms2_sel:={
ind<-which.min(abs(ms2_rt-pc_rt))
z<-ms2_sel
z[[ind]]<-T
z
},by=BASE_KEY_MS2]
setkeyv(qa_ms2,BASE_KEY_MS2)
m$qa$ms2 <- qa_ms2
m
}
## Analyze extracted data.
analyse_extracted_data <- function(extr,prescreen_param) {
ms1 <- extr$ms1
ms2 <- extr$ms2
## Parameters.
presconf <- conf_trans_pres(prescreen_param)
rt_shift <- presconf$ret_time_shift_tol
det_ms2_noise <- presconf$det_ms2_noise
ms2_int_thresh <- presconf$ms2_int_thresh
ms1_int_thresh <- presconf$ms1_int_thresh
## Detect MS2 noise.
ms2_clc_ns <- if (det_ms2_noise) {
if (ms2_int_thresh>0) {
warning("Ignore user specified ms2_int_thresh value and autodetect noise instead.")
}
ms2[,ms2_thr:=0.33333333*mean(intensity),by="tag"]
} else {
ms2[,ms2_thr:=ms2_int_thresh]
}
## message('ms2_clc_ns:',key(ms2_clc_ns))
## ms2_clc_ns <- ms2_clc_ns[intensity>ms2_thr]
## We drop mz info.
tab_ms2 <- ms2_clc_ns[,.(ms2_rt=first(rt),ms2_int=max(intensity),ms2_thr=first(ms2_thr)),by=c(BASE_KEY_MS2,'an')]
tab_ms2[,qa_ms2_good_int:=ms2_int>ms2_thr,by="an"]
data.table::setkeyv(tab_ms2,BASE_KEY_MS2)
tab_ms2[,`:=`(rt_left = ms2_rt - rt_shift,rt_right = ms2_rt + rt_shift)]
## Get mean ms1 value.
tab_ms1 <- extr$ms1
tab_ms1_mean <- tab_ms1[,.(ms1_mean=mean(intensity)),keyby=BASE_KEY]
## This function extracts intensity maxima on intervals given by
## RT vectors rt_1 and rt_2.
find_ms1_max <- function(rt,intensity,rt_1,rt_2)
{
mapply(function (rt_1,rt_2) {
rt_ival <- c(rt_1,rt_2)
intv <- findInterval(rt,rt_ival)
lintv = length(intv)
if (intv[1]==0L && intv[lintv] == 2L) {
pos = match(c(1L,2L),intv)
} else if (intv[1]==1L && intv[lintv]!=1L) {
pos = c(1L,match(2L,intv))
} else if (intv[1]==0L && intv[lintv]!=0L) {
pos = c(match(1L,intv),lintv)
} else {
pos = c(1L,lintv)
}
pmax = pos[[1]] + which.max(intensity[pos[[1]]:pos[[2]]]) - 1L
c(rt[pmax],intensity[pmax])
}, rt_1, rt_2, USE.NAMES=F)
}
## Perform MS1 maxima calculation in the neighbourhood of each
## MS2 result.
tmp = tab_ms1[tab_ms2,{
xx = find_ms1_max(rt,intensity,i.rt_left,i.rt_right)
.(an=i.an,
ms1_rt = xx[1,],
ms1_int = xx[2,])
},by=.EACHI, nomatch=NULL]
## Calculate QA values.
tab_ms2[tmp,c("ms1_rt","ms1_int"):=.(i.ms1_rt,i.ms1_int),on=c(BASE_KEY,'an')]
tab_ms2[,c("rt_left","rt_right"):=c(NULL,NULL)]
tab_ms2[tab_ms1_mean,ms1_mean:=i.ms1_mean]
tab_ms2[,`:=`(qa_ms1_good_int=fifelse(ms1_int>ms1_int_thresh,T,F),
qa_ms1_above_noise=F,
qa_ms2_near=F)]
## TODO: I wonder if so stupidly auto-calculated ms1 noise should
## be taken into account at all? My recollection from earlier
## times was that it was helpful, at least sometimes.
tab_ms2[qa_ms1_good_int==T,qa_ms1_above_noise:=fifelse(ms1_int>ms1_mean/3.,T,F)]
tab_ms2[qa_ms1_good_int==T & qa_ms1_above_noise==T & qa_ms2_good_int==T,qa_ms2_near:=T]
tab_ms2$qa_ms2_exists=T
## Find MS1 with no corresponding MS2.
ms1key <- tab_ms1[,unique(.SD),.SDcol=BASE_KEY]
ms2key <- tab_ms2[,unique(.SD),.SDcol=BASE_KEY]
ms2key$mch=T
tabmatches <- ms2key[ms1key]
ms1woms2 <- tabmatches[is.na(mch)][,mch:=NULL]
## calculate the most intense peak, its location and the mean for
## childless MS1.
tab_noms2 <- tab_ms1[ms1woms2,.(ms1_mean=mean(intensity),ms1_rt=rt[which.max(intensity)],ms1_int=max(intensity)),by=.EACHI,nomatch=NULL]
## QA for the above (lazy qa ... take only the max peak into account).
tab_noms2[,c("qa_ms1_good_int","qa_ms1_above_noise"):=.(ms1_int>ms1_int_thresh,ms1_int>ms1_mean/3.)]
## MS2 QA criteria all fail.
tab_noms2[,c("qa_ms2_exists","qa_ms2_good_int","qa_ms2_near"):=.(F,F,F)]
## Bind MS1-only and MS1/MS2 entries together.
res <- rbind(tab_ms2,tab_noms2,fill=T,use.names=T)
## Every single entry which was extracted has at least MS1.
res[,qa_ms1_exists:=T]
data.table::setkeyv(res,BASE_KEY)
qflg <- QA_FLAGS[!(QA_FLAGS %in% "qa_pass")]
res[,qa_pass:=apply(.SD,1,all),.SDcols=qflg]
res[.(T),del_rt:=abs(ms2_rt - ms1_rt),on="qa_pass",by='an']
resby <- BASE_KEY_MS2[! (BASE_KEY_MS2 %in% 'an')]
res[.(T),qa_tmp_ms1_max:= ms1_int==max(ms1_int),on="qa_pass",by=resby]
res[,ms2_sel:=F]
res[.(T,T),ms2_sel:= del_rt == del_rt[which.min(del_rt)],on=c("qa_pass","qa_tmp_ms1_max"),by=resby]
res[,qlt_ms1:=apply(.SD,1,function(rw) sum(c(5L,3L,2L)*rw)),.SDcol=c("qa_ms1_exists",
"qa_ms1_above_noise",
"qa_ms1_good_int")]
res[,qlt_ms2:=apply(.SD,1,function(rw) sum(c(5L,3L,2L)*rw)),.SDcol=c("qa_ms2_exists",
"qa_ms2_near",
"qa_ms2_good_int")]
res
}
## Based on the `comprehensive' and `qa' tabs, greate `summ'.
gen_summ <- function(comp,qa) {
comp_cols <- intersect(SUMM_COLS,colnames(comp))
rdcomp <- comp[,..comp_cols]
data.table::setkeyv(rdcomp,BASE_KEY)
summ <- qa[rdcomp,nomatch=F] #We changed `nomatch' cases from NA
#to F, because NA does not work well
#with X == F condition.
## flgs <- c(QA_FLAGS,"ms2_sel")
## summ[is.na(qa_ms1_exists),(flgs):=F]
data.table::setkeyv(summ,SUMM_KEY)
summ[.(F),c("qlt_ms1","qlt_ms2"):=0.,on="qa_ms1_exists"]
summ
}
gen_mz_err_f <- function(entry,msg) {
eppm <- grab_unit(entry,"ppm")
eda <- grab_unit(entry,"Da")
......
This diff is collapsed.
......@@ -124,13 +124,14 @@ NUM_INP_HEIGHT="5%"
## Possible compound list fields
EMPTY_CMPD_LIST = dtable(ID=character(),
SMILES=character(),
Name=character(),
Formula=character(),
RT=numeric(),
mz=numeric(),
known=character(),
ORIG=character())
SMILES=character(),
Name=character(),
Formula=character(),
RT=numeric(),
mz=numeric(),
known=character(),
set=character(),
ORIG=character())
COMP_LIST_COLS = c("ID","Name","SMILES","Formula","RT","mz")
## Comprehensive table properties
COMP_NAME_MAP = list(RT="rt")
......@@ -204,8 +205,8 @@ REPORT_TITLE = "Plots of EICs and MS2 Spectra"
## Select the most fundamental group of entries. Within this group,
## each ID is unique.
BASE_KEY = c("adduct","tag","ID")
BASE_KEY_MS2 = c(BASE_KEY,"CE","an")
BASE_KEY = "precid"#c("adduct","tag","ID")
BASE_KEY_MS2 = c("precid","ce","scan")#c(BASE_KEY,"CE","scan")
FIG_DEF_CONF =list(grouping=list(group="adduct",
plot="ID",
......@@ -216,7 +217,7 @@ FIG_DEF_CONF =list(grouping=list(group="adduct",
SUMM_COLS=c("set",BASE_KEY_MS2,"mz","ms1_rt", "ms1_int", "ms2_rt", "ms2_int",
"ms1_mean","ms2_sel",QA_FLAGS,"Name", "SMILES", "Formula", "known","Comments","file")
SUMM_KEY = c("set","ID","adduct","tag","an")
SUMM_KEY = c("set","ID","adduct","tag","precid","catid","scan")
PLOT_FEATURES = c("adduct",
"tag",
......@@ -224,34 +225,34 @@ PLOT_FEATURES = c("adduct",
## Empty summary table.
EMPTY_SUMM = data.table::data.table(set=character(0),
adduct=character(0),
tag=character(0),
ID=character(0),
CE=character(0),
an=integer(0),
mz=numeric(0),
ms1_rt=numeric(0),
ms1_int=numeric(0),
ms2_rt=numeric(0),
ms2_int=numeric(0),
ms1_mean=numeric(0),
ms2_sel=logical(0),
qa_pass=logical(0),
qa_ms1_exists=logical(0),
qa_ms2_exists=logical(0),
qa_ms1_good_int=logical(0),
qa_ms1_above_noise=logical(0),
qa_ms2_near=logical(0),
qa_ms2_good_int=logical(0),
Name=character(0),
SMILES=character(0),
Formula=character(0),
known=character(0),
Comments=character(0),
file=character(0))
adduct=character(0),
tag=character(0),
ID=character(0),
CE=character(0),
scan=character(0),
mz=numeric(0),
ms1_rt=numeric(0),
ms1_int=numeric(0),
ms2_rt=numeric(0),
ms2_int=numeric(0),
ms1_mean=numeric(0),
ms2_sel=logical(0),
qa_pass=logical(0),
qa_ms1_exists=logical(0),
qa_ms2_exists=logical(0),
qa_ms1_good_int=logical(0),
qa_ms1_above_noise=logical(0),
qa_ms2_near=logical(0),
qa_ms2_good_int=logical(0),
Name=character(0),
SMILES=character(0),
Formula=character(0),
known=character(0),
Comments=character(0),
file=character(0))
## Default sorting keys of spectra in the summary table
DEF_KEY_SUMM = c(BASE_KEY_MS2,"an")
DEF_KEY_SUMM = c(BASE_KEY_MS2,"scan")
SUBSET_VALS = c(IGNORE="ignore",
......@@ -360,3 +361,7 @@ METFRAG_RESULT_READF = list(csv = function(file,...) data.table::fread(file=file
xml = function(file,...) readxl::read_excel(path=file,...))
METFRAG_DEFAULT_PROC = 1L
## DATA MODEL
DB_CATALOGUE_KEY = c("set","tag","adduct","ID")
......@@ -22,7 +22,7 @@
#' @importFrom shiny selectInput numericInput textInput HTML
GUI_SELECT_INPUTS <- c("proj_list",
GUI_SELECT_INPUTS = c("proj_list",
"indir_list",
"ms1_coarse_unit",
"ms1_fine_unit",
......@@ -30,7 +30,7 @@ GUI_SELECT_INPUTS <- c("proj_list",
"ret_time_shift_tol_unit",
"dfile_list")
GUI_NUMERIC_INPUTS <- c("ms1_coarse",
GUI_NUMERIC_INPUTS = c("ms1_coarse",
"ms1_fine",
"ms1_eic",
"ms1_rt_win",
......@@ -39,14 +39,14 @@ GUI_NUMERIC_INPUTS <- c("ms1_coarse",
"s2n",
"ret_time_shift_tol")
GUI_TEXT_INPUTS <- c("rep_aut",
GUI_TEXT_INPUTS = c("rep_aut",
"rep_tit")
GUI_RADIO_INPUTS <- c("missingprec")
GUI_RADIO_INPUTS = c("missingprec")
GUI_ALL_INPUTS <- c(GUI_SELECT_INPUTS,
GUI_ALL_INPUTS = c(GUI_SELECT_INPUTS,
GUI_NUMERIC_INPUTS,
GUI_TEXT_INPUTS,
GUI_RADIO_INPUTS)
......@@ -55,8 +55,8 @@ GUI_ALL_INPUTS <- c(GUI_SELECT_INPUTS,
add_new_def_tag <- function(old_tags,how_many) {
ind = which(grepl(r"(^F\d+$)",old_tags))
st_num = if (length(ind)>0L) {
old_def_tags <- old_tags[ind]
tag_nums <- gsub(r"(^F(\d+)$)",r"(\1)",old_def_tags)
old_def_tags = old_tags[ind]
tag_nums = gsub(r"(^F(\d+)$)",r"(\1)",old_def_tags)
max(as.integer(tag_nums))
......@@ -162,26 +162,26 @@ create_gui <- function(project_path=NA_character_) {
#'@export
r2datatab <- function(rdatatab) {
shiny::isolate({
file <- rdatatab$file
adduct <- rdatatab$adduct
tag <- rdatatab$tag
set <- rdatatab$set
file = rdatatab$file
adduct = rdatatab$adduct
tag = rdatatab$tag
set = rdatatab$set
})
if (length(file)==0L) file <- character(0)
if (length(adduct)==0L) adduct <- rep(NA_character_,length(file))
if (length(tag)==0L) tag <- rep(NA_character_,length(file))
if (length(set)==0L) tag <- rep(NA_character_,length(file))
if (length(file)==0L) file = character(0)
if (length(adduct)==0L) adduct = rep(NA_character_,length(file))
if (length(tag)==0L) tag = rep(NA_character_,length(file))
if (length(set)==0L) tag = rep(NA_character_,length(file))
data.table(tag=tag,adduct=adduct,set=set,file=file)
}
r2filetag <- function(rfiletag) {
shiny::isolate({
file <- rfiletag$file
tag <- rfiletag$tag
file = rfiletag$file
tag = rfiletag$tag
})
if (length(file)==0L) file <- character(0)
if (length(tag)==0L) tag <- rep(NA_character_,length(file))
if (length(file)==0L) file = character(0)
if (length(tag)==0L) tag = rep(NA_character_,length(file))
data.table(tag=tag,file=file)
}
......@@ -193,7 +193,7 @@ gen_dtab <- function(tablist,sets) {
r2compounds <- function(rcompounds) {
shiny::isolate({
cmpd_lists <- rcompounds$lists
cmpd_lists = rcompounds$lists
})
list(lists=cmpd_lists)
......@@ -202,17 +202,17 @@ r2compounds <- function(rcompounds) {
#' @export
pack_app_state <- function(input, gui) {
pack <- list()
pack = list()
shiny::isolate({
pack_inputs <- list()
pack_input_names <- which_gui_inputs(inputs)
pack_inputs <- shiny::reactiveValuesToList(input)[pack_input_names]
pack$input <- pack_inputs
pack$datatab <- r2datatab(gui$datatab)
pack$filetag <- r2filetag(gui$filetag)
pack$compounds <- r2compounds(gui$compounds)
pack$paths <- list()
pack$paths$data <- gui$paths$data
pack_inputs = list()
pack_input_names = which_gui_inputs(inputs)
pack_inputs = shiny::reactiveValuesToList(input)[pack_input_names]
pack$input = pack_inputs
pack$datatab = r2datatab(gui$datatab)
pack$filetag = r2filetag(gui$filetag)
pack$compounds = r2compounds(gui$compounds)
pack$paths = list()
pack$paths$data = gui$paths$data
})
pack
......@@ -318,19 +318,21 @@ unpack_app_state <- function(session,envopts,input,top_data_dir,project_path,pac
input=input,
packed_state=packed_state)
gui <- create_gui(project_path=project_path)
gui$compounds$lists <- packed_state$compounds$lists
gui$datatab$file <- packed_state$datatab$file
gui$datatab$adduct <- packed_state$datatab$adduct
gui$datatab$tag <- packed_state$datatab$tag
gui$datatab$set <- packed_state$datatab$set
gui$filetag$file <- packed_state$filetag$file
gui$filetag$tag <- packed_state$filetag$tag
x <- packed_state$paths$data
gui$paths$data = if (length(x)>0 && nchar(x)>0) basename(x) else ""
if (!dir.exists(file.path(top_data_dir,gui$paths$data))) {warning("Data directory ", gui$paths$data, " does not exist. You must select one.")}
gui = create_gui(project_path=project_path)
gui$compounds$lists = packed_state$compounds$lists
gui$datatab$file = packed_state$datatab$file
gui$datatab$adduct = packed_state$datatab$adduct
gui$datatab$tag = packed_state$datatab$tag
gui$datatab$set = packed_state$datatab$set
gui$filetag$file = packed_state$filetag$file
gui$filetag$tag = packed_state$filetag$tag
x = packed_state$paths$data
gui$paths$data = if (length(x)>0 && nchar(x)>0) {
file.path(top_data_dir,basename(x))
} else ""
if (!dir.exists(gui$paths$data)) {warning("Data directory ", gui$paths$data, " does not exist. You must select one.")}
gui
})
......@@ -340,12 +342,12 @@ unpack_app_state <- function(session,envopts,input,top_data_dir,project_path,pac
input2conf_setup <- function(input,gui,conf=list()) {
if (length(conf)==0L) {
conf$compounds <- list()
conf$summary_table <- list()
conf$debug <- F
conf$compounds = list()
conf$summary_table = list()
conf$debug = F
}
conf$compounds$lists <- gui$compounds$lists
conf$compounds$lists = gui$compounds$lists
conf$paths$data = basename(gui$paths$data)
conf
}
......@@ -353,11 +355,11 @@ input2conf_setup <- function(input,gui,conf=list()) {
input2conf_extract <- function(input,conf) {
conf$tolerance = list()
conf$extract = list()
conf$tolerance[["ms1 fine"]] <- paste(input$ms1_fine,input$ms1_fine_unit)
conf$tolerance[["ms1 coarse"]] <- paste(input$ms1_coarse,input$ms1_coarse_unit)
conf$tolerance[["eic"]] <- paste(input$ms1_eic,input$ms1_eic_unit)
conf$tolerance[["rt"]] <- paste(input$ms1_rt_win,input$ms1_rt_win_unit)
conf$extract$missing_precursor_info <- input$missingprec
conf$tolerance[["ms1 fine"]] = paste(input$ms1_fine,input$ms1_fine_unit)
conf$tolerance[["ms1 coarse"]] = paste(input$ms1_coarse,input$ms1_coarse_unit)
conf$tolerance[["eic"]] = paste(input$ms1_eic,input$ms1_eic_unit)
conf$tolerance[["rt"]] = paste(input$ms1_rt_win,input$ms1_rt_win_unit)
conf$extract$missing_precursor_info = input$missingprec
conf
}
......@@ -365,26 +367,26 @@ input2conf_extract <- function(input,conf) {
input2conf_prescreen <- function(input,conf) {
conf$prescreen = list()
conf$prescreen[["ms1_int_thresh"]] <- input$ms1_int_thresh
conf$prescreen[["ms2_int_thresh"]] <- input$ms2_int_thresh
conf$prescreen[["s2n"]] <- input$s2n
conf$prescreen[["ret_time_shift_tol"]] <- paste(input$ret_time_shift_tol,input$ret_time_shift_tol_unit)
conf$prescreen[["ms1_int_thresh"]] = input$ms1_int_thresh
conf$prescreen[["ms2_int_thresh"]] = input$ms2_int_thresh
conf$prescreen[["s2n"]] = input$s2n
conf$prescreen[["ret_time_shift_tol"]] = paste(input$ret_time_shift_tol,input$ret_time_shift_tol_unit)
conf
}
input2conf_figures <- function(input,conf) {
conf$figures = list()
conf$figures$rt_min <- paste(input$plot_rt_min,input$plot_rt_min_unit)
conf$figures$rt_max <- paste(input$plot_rt_max,input$plot_rt_max_unit)
conf$figures$ext <- input$plot_ext
conf$figures$rt_min = paste(input$plot_rt_min,input$plot_rt_min_unit)
conf$figures$rt_max = paste(input$plot_rt_max,input$plot_rt_max_unit)
conf$figures$ext = input$plot_ext
conf
}
input2conf_report <- function(input,conf) {
conf$report = list()
conf$report$author <- input$rep_aut
conf$report$title <- input$rep_tit
conf$report$author = input$rep_aut
conf$report$title = input$rep_tit
conf
}
......@@ -401,7 +403,7 @@ input2conf_metfrag <- function(input,conf) {
MetFragPreProcessingCandidateFilter = paste(input$mf_pre_processing_candidate_filter,collapse=","),
MetFragPostProcessingCandidateFilter = paste(input$mf_post_processing_candidate_filter,collapse=","))
## TODO: FIXME: We need to move away from unit weights from some
## TODO: FIXME: We need to move away from unit weights at some
## point. This needs some extra widgets (Sigh!).
insc = sapply(input$mf_scores_intrinsic,function(x) 1.0)
names(insc) = input$mf_scores_intrinsic
......@@ -425,7 +427,7 @@ app_update_conf <- function(input,gui,envopts,fconf,m) {
fstr = paste0("input2conf_",fstrp)
m$conf = do.call(fstr,list(input,conf=m$conf))
}
m$run <- new_runtime_state(project=gui$paths$project,
m$run = new_runtime_state(project=gui$paths$project,
envopts = envopts,
conf=m$conf)
m
......@@ -453,28 +455,24 @@ app_state2state <- function(input,gui,envopts,m=NULL) {
}
get_sets <- function(gui) {
## TODO FIXME
## Think about this
## fn_lists <- file.path(gui$paths$project,gui$compounds$lists)
## df <- fread(file=fn_lists)
## if (!
## res = df[,unique(set)]
if (length(res)==0L) res = "ALL"
fn_lists = file.path(gui$paths$project,gui$compounds$lists)
cmpds = join_compound_lists(fn_lists)
cmpds = process_cmpd_sets(cmpds)
cmpds[,unique(set)]
}
gen_dfiles_tab <- function(gui) {
curr_file <- gui$filetag$file
curr_tag <- gui$filetag$tag
curr_file = gui$filetag$file
curr_tag = gui$filetag$tag
res <- data.table(file=curr_file,tag=curr_tag)
res = data.table(file=curr_file,tag=curr_tag)
res
}
gui2datatab <- function(gui) {
df <- data.table(tag=as.character(gui$datatab$tag),
df = data.table(tag=as.character(gui$datatab$tag),
adduct=as.character(gui$datatab$adduct),
set=as.character(gui$datatab$set),
file=as.character(gui$datatab$file))
......@@ -493,55 +491,59 @@ gui2datatab <- function(gui) {
## (of tags, CEs) in the index.
gen_cindex <- function(summ,sorder,cols = CINDEX_COLS,by. = CINDEX_BY) {
if (NROW(summ) == 0L) return(NULL)
allc <- c(by.,cols)
xsumm <- summ[,..allc]
allc = c(by.,cols)
xsumm = summ[,..allc]
setnames(xsumm,old="ms1_rt",new="rt",skip_absent=T)
res <- xsumm[,.SD[max(qlt_ms1)==qlt_ms1][max(qlt_ms2)==qlt_ms2],by=by.]
res <- res[,c("mz","rt","Name","qlt_ms1","qlt_ms2"):=.(first(mz),
first(mean(rt)),
first(Name),
first(qlt_ms1),
first(qlt_ms2)),
res = xsumm[,.SD[max(qlt_ms1)==qlt_ms1][max(qlt_ms2)==qlt_ms2],by=by.]
res = res[,c("mz","rt","Name","qlt_ms1","qlt_ms2"):=.(mean(mz,na.rm=T),
mean(rt,na.rm=T),
first(Name),
max(qlt_ms1,na.rm=T),
max(qlt_ms2,na.rm=T)),
by=by.]
res <- res[,unique(.SD),by=by.]
res = res[,unique(.SD),by=by.]
sorder <- unique(sorder)
wna <- which(sorder=="nothing"); if (length(wna)>0L) sorder <- sorder[-wna]
quality <- which("quality"==sorder)
sorder = unique(sorder)
wna = which(sorder=="nothing"); if (length(wna)>0L) sorder = sorder[-wna]
quality = which("quality"==sorder)
if (length(quality)>0L) {
pre <- head(sorder,quality-1L)
post <- tail(sorder,-quality)
sorder <- c(pre,"qlt_ms1","qlt_ms2",post)
pre = head(sorder,quality-1L)
post = tail(sorder,-quality)
sorder = c(pre,"qlt_ms1","qlt_ms2",post)
}
ord <- rep(1L,length(sorder))
ord = rep(1L,length(sorder))
if ("qlt_ms1" %in% sorder) {
ind <- which(sorder %in% c("qlt_ms1","qlt_ms2"))
ord[ind] <- -1L
ind = which(sorder %in% c("qlt_ms1","qlt_ms2"))
ord[ind] = -1L
}
if (length(sorder)>0) setorderv(res,cols=sorder,order=ord)
setnames(res,old="rt",new="rt(ms1)")
## Remove confusing columns.
res[,c("rt","mz"):=NULL]
res[,c("qlt_ms1","qlt_ms2"):=.(signif(100*qlt_ms1/10.,3),signif(100*qlt_ms2/10.,3))]
setnames(res,c("qlt_ms1","qlt_ms2"),c("best score (ms1)","best score (ms2)"))
res
}
cindex_from_input <- function(clabs,sort_catg=character(4),summ) {
grp <- if (isTruthy(clabs)) setdiff(CINDEX_BY,clabs) else CINDEX_BY
sorder <- setdiff(sort_catg,clabs)
grp = if (isTruthy(clabs)) setdiff(CINDEX_BY,clabs) else CINDEX_BY
sorder = setdiff(sort_catg,clabs)
gen_cindex(summ,sorder=sorder,by=grp)
}
get_cindex_key <- function(cindex) {
## Select only valid category names.
x <- which(CINDEX_BY %in% names(cindex))
x = which(CINDEX_BY %in% names(cindex))
CINDEX_BY[x]
}
get_cindex_parents <- function(summ,ckey,kvals,labs) {
## Get kvals part of summ.
tab <- summ[(kvals),on=names(kvals)][,unique(.SD),.SDcols=labs,by=ckey] #get_data_from_key(summ,kvals)
tab = summ[(kvals),on=names(kvals)][,unique(.SD),.SDcols=labs,by=ckey] #get_data_from_key(summ,kvals)
tab[,item:=do.call(paste,c(.SD,list(sep=";"))),.SDcol=labs]
keys <- names(tab)[names(tab)!="item"]
keys = names(tab)[names(tab)!="item"]
data.table::setkeyv(tab,keys)
tab
}
......@@ -550,28 +552,28 @@ get_cindex_kval <- function(cindex,row,key) {
## Accounting for not fully initialised state.
if (!is.numeric(row) || is.na(row) || length(key)==0L || is.na(key) || NROW(cindex)==0L) return(NULL)
rowtab <- cindex[(row),..key]
res <- lapply(rowtab,function (x) x[[1]])
names(res) <- key
rowtab = cindex[(row),..key]
res = lapply(rowtab,function (x) x[[1]])
names(res) = key
res
}
get_summ_subset <- function(summ,ptab,paritem,kvals) {
select <- ptab[item==(paritem)]
tab <- get_data_from_key(summ,kvals)[select,nomatch=NULL,on=key(ptab)]
if ("an.1" %in% names(tab)) tab[,an.1:=NULL] #TODO: This is
get_summ_subset <- function(db,summ,ptab,paritem,kvals) {
select = ptab[item==(paritem)]
tab = get_data_from_key(db=db,tab=summ,kvals=kvals)[select,nomatch=NULL,on=key(ptab)]
if ("scan.1" %in% names(tab)) tab[,scan.1:=NULL] #TODO: This is
#probably a lousy
#hack.
tab
}
get_ltab <- function(summ_subs,cols=c("an","ms2_rt")) {
tab <- summ_subs
get_ltab <- function(summ_subs,cols=c("scan","ms2_rt")) {
tab = summ_subs
if (NROW(tab)==1L && is.na(tab$an)) return(data.table::data.table(item=character()))
tab[is.na(ms2_sel),ms2_sel:=F] #TODO FIXME: Check why NAs exist at all?
tab[is.na(ms2_sel),ms2_sel:=F]
tab[,passval:=fifelse(qa_pass==T,"OK","BAD")]
tab[ms2_sel==T,passval:="SELECTED"]
res <- tab[,item:=do.call(paste,c(.SD,list(sep=";"))),.SDcols=c(cols,"passval")]
res = tab[,item:=do.call(paste,c(.SD,list(sep=";"))),.SDcols=c(cols,"passval")]
data.table::setkey(res,"ms2_rt")
res
}
......@@ -585,18 +587,18 @@ update_on_commit_chg <- function(summ,input,ptab,ltab) {
n_ms2_sel = input$chg_ms2sel
sel_par <- input$sel_parent_trace
sel_spec <- input$sel_spec
sel_par = input$sel_parent_trace
sel_spec = input$sel_spec
pkvals <- ptab[item==(sel_par),.SD,.SDcols=intersect(SUMM_KEY,names(ptab))]
lkvals <- ltab[item==(sel_spec),.SD,.SDcols=intersect(SUMM_KEY,names(ltab))]
kvals <- c(as.list(pkvals),as.list(lkvals))
kvals <- kvals[unique(names(kvals))]
pkvals = ptab[item==(sel_par),.SD,.SDcols=intersect(SUMM_KEY,names(ptab))]
lkvals = ltab[item==(sel_spec),.SD,.SDcols=intersect(SUMM_KEY,names(ltab))]
kvals = c(as.list(pkvals),as.list(lkvals))
kvals = kvals[unique(names(kvals))]
if ('an' %in% names(kvals) && n_ms2_sel) {
rkvals <- kvals[!(names(kvals) %in% 'an')]
rktab <- tabkey(summ,kvals=rkvals)
tabsel <- summ[rktab,.(an,ms2_sel)]
ansel <- tabsel[ms2_sel == T,an]
rkvals = kvals[!(names(kvals) %in% 'an')]
rktab = tabkey(summ,kvals=rkvals)
tabsel = summ[rktab,.(scan,ms2_sel)]
ansel = tabsel[ms2_sel == T,scan]
print('ansel')
print(ansel)
if (length(ansel)!=0) {
......@@ -607,29 +609,29 @@ update_on_commit_chg <- function(summ,input,ptab,ltab) {
}
tgts <- c("ms1_rt","ms1_int",names(n_qa),"ms2_sel")
srcs <- c(list(n_ms1_rt,n_ms1_int),as.list(n_qa),as.list(n_ms2_sel))
tgts = c("ms1_rt","ms1_int",names(n_qa),"ms2_sel")
srcs = c(list(n_ms1_rt,n_ms1_int),as.list(n_qa),as.list(n_ms2_sel))
the_row <- tabkey(summ,kvals=kvals)
the_row = tabkey(summ,kvals=kvals)
summ[the_row,(tgts):=..srcs]
summ[,an.1:=NULL] #FIXME: an.1 pops up somewhere.
qflg <- QA_FLAGS[!(QA_FLAGS %in% "qa_pass")]
summ[,scan.1:=NULL]
qflg = QA_FLAGS[!(QA_FLAGS %in% "qa_pass")]
summ[the_row,qa_pass:=apply(.SD,1,all),.SDcols=qflg]
summ
}
get_mprop_ms2_metadata <- function(ltab_entry) {
res <- list(rt=NA_real_,int=NA_real_,qa=character(0),ms2_sel=F)
res = list(rt=NA_real_,int=NA_real_,qa=character(0),ms2_sel=F)
if (NROW(ltab_entry)==0L) return(res)
res$rt = ltab_entry$ms1_rt
res$int = ltab_entry$ms1_int
z <- ltab_entry[.SD,.SDcols=patterns("qa_ms[12].*")]
lqa_vals <- as.list(ltab_entry[,.SD,.SDcols=patterns("qa_ms[12].*")])
qa_names <- names(lqa_vals)
res$qa <- qa_names[as.logical(lqa_vals)]
z = ltab_entry[.SD,.SDcols=patterns("qa_ms[12].*")]
lqa_vals = as.list(ltab_entry[,.SD,.SDcols=patterns("qa_ms[12].*")])
qa_names = names(lqa_vals)
res$qa = qa_names[as.logical(lqa_vals)]
res$ms2_sel = ltab_entry$ms2_sel
res
......
......@@ -564,7 +564,7 @@ mk_shinyscreen_server <- function(projects,init) {
res = if (NROW(sel_dt)>0) {
tab[sel_dt,..coln,on=c("adduct","tag","ID","an")]
tab[sel_dt,..coln,on=c("adduct","tag","ID","scan")]
} else triv
data.table::setnames(res,"intensity","ms2_int")
res
......@@ -700,8 +700,6 @@ mk_shinyscreen_server <- function(projects,init) {
## REACTIVE FUNCTIONS: COMPOUND INDEX
rf_get_cindex <- reactive({
## TODO: FIXME: Uncomment after rearranging everything.
## input$cmt_changes_b
rvs$status$is_qa_stat
s1 = input$sort1
s2 = input$sort2
......@@ -741,8 +739,8 @@ mk_shinyscreen_server <- function(projects,init) {
rf_get_cindex_parents <- reactive({
rvs$m
isolate({
ms1 = rvs$m$extr$ms1
ms2 = rvs$m$extr$ms2
ms1 = rvs$m$db$extr$cgm$ms1
ms2 = rvs$m$db$extr$cgm$ms2
summ = req(rvs$m$out$tab$summ)
})
......@@ -760,7 +758,8 @@ mk_shinyscreen_server <- function(projects,init) {
kvals = req(rf_get_cindex_kval())
ptab = rf_get_cindex_parents()
if (isTruthy(parent)) {
get_summ_subset(summ=summ,
get_summ_subset(db=rvs$m$db,
summ=summ,
ptab=ptab,
paritem=parent,
kvals=kvals)
......@@ -813,20 +812,23 @@ mk_shinyscreen_server <- function(projects,init) {
rf_plot_eic_ms1 <- reactive({
isolate({
ms1 = rvs$m$extr$ms1
ms1 = rvs$m$db$extr$cgm$ms1
summ = rvs$m$out$tab$summ
db = rvs$m$db
})
req(NROW(summ)>0L)
req(NROW(ms1)>0L)
p = make_eic_ms1_plot(ms1,summ,kvals=rf_get_cindex_kval(),
labs=rf_get_cindex_labs(),
asp=PLOT_EIC_ASPECT,
rt_range=rf_get_rtrange(),
i_range=rf_get_irange(),
colrdata = rf_colrdata())
p = make_eic_ms1_plot(db=db,
summ,
kvals=rf_get_cindex_kval(),
labs=rf_get_cindex_labs(),
asp=PLOT_EIC_ASPECT,
rt_range=rf_get_rtrange(),
i_range=rf_get_irange(),
colrdata = rf_colrdata())
p = if (!is.null(p)) p else empty_plot("Nothing to plot")
......@@ -851,12 +853,13 @@ mk_shinyscreen_server <- function(projects,init) {
gg = rf_plot_eic_ms1()
rt_rng = range(gg$data$rt)
p = make_eic_ms2_plot(summ,
kvals=rf_get_cindex_kval(),
labs=rf_get_cindex_labs(),
rt_range = rf_get_ms2_eic_rtrange(),
asp=PLOT_EIC_ASPECT,
colrdata=rf_colrdata())
p = make_eic_ms2_plot(rvs$m$db,
summ,
kvals=rf_get_cindex_kval(),
labs=rf_get_cindex_labs(),
rt_range = rf_get_ms2_eic_rtrange(),
asp=PLOT_EIC_ASPECT,
colrdata=rf_colrdata())
p = if (!is.null(p)) p else empty_plot("Nothing to plot")
......@@ -878,15 +881,15 @@ mk_shinyscreen_server <- function(projects,init) {
rf_plot_spec_ms2 <- reactive({
isolate({
summ = rvs$m$out$tab$summ
ms2 = rvs$m$extr$ms2
ms2 = rvs$m$db$extr$cgm$ms2
})
req(NROW(summ)>0L)
req(NROW(ms2)>0L)
p = make_spec_ms2_plot(ms2,
summ,
kvals=req(rf_get_cindex_kval()),
labs=req(rf_get_cindex_labs()),
colrdata=rf_colrdata())
p = make_spec_ms2_plot(db = rvs$m$db,
summ,
kvals=req(rf_get_cindex_kval()),
labs=req(rf_get_cindex_labs()),
colrdata=rf_colrdata())
p = if (!is.null(p)) p else empty_plot("Nothing to plot")
p
......@@ -905,29 +908,35 @@ mk_shinyscreen_server <- function(projects,init) {
choices = list.files(path=top_data_dir,
pattern = CMPD_LIST_PATT))
updateSelectInput(session = session,
inputId = "dfile_list",
choices = list.files(path=top_data_dir,
pattern = DFILES_LIST_PATT))
updateSelectInput(session = session,
inputId = "top_data_dir_list",
selected = basename(top_data_dir),
choices = list.dirs(path = init$envopts$top_data_dir,
full.names = F,
recursive = F))
})
observe({
top_data_dir = rvs$gui$paths$data
req(isTruthy(top_data_dir) && dir.exists(top_data_dir))
data_dir = rvs$gui$paths$data
if ((isTruthy(data_dir) && dir.exists(data_dir))) {
dfchoices = list.files(path=data_dir,
pattern = DFILES_LIST_PATT)
updateSelectInput(session = session,
inputId = "dfile_list",
choices = dfchoices)
updateSelectInput(session = session,
inputId = "top_data_dir_list",
selected = basename(rvs$gui$paths$data),
choices = list.dirs(path = init$envopts$top_data_dir,
full.names = F,
recursive = F))
}
updateSelectInput(session = session,
inputId = "dfile_list",
choices = list.files(path=top_data_dir,
pattern = DFILES_LIST_PATT))
})
## Update projects and data directories every second.
observeEvent(rtimer1000(),{
......@@ -981,13 +990,15 @@ mk_shinyscreen_server <- function(projects,init) {
rvs$status$ms2_int_thresh_stat = rvs$m$conf$prescreen[["ms2_int_thresh"]]
rvs$status$s2n_stat = rvs$m$conf$prescreen[["s2n"]]
rvs$status$ret_time_shift_tol_stat = rvs$m$conf$prescreen[["ret_time_shift_tol"]]
if (NROW(m$extr$ms1)>0L) rvs$status$is_extracted_stat = "Yes."
if (NROW(m$db$extr$cgm$ms1)>0L) rvs$status$is_extracted_stat = "Yes."
if (NROW(m$out$tab$summ)>0L) rvs$status$is_qa_stat = "Yes."
} else {
message("Initialising project: ",wd)
rvs$gui = create_gui(project_path=fullwd)
}
message("project: ",rvs$gui$project())
}, label = "project-b")
......@@ -1017,10 +1028,11 @@ mk_shinyscreen_server <- function(projects,init) {
observeEvent(input$sel_data_dir_b,{
data_dir = input$top_data_dir_list
req(isTruthy(data_dir))
rvs$gui$paths$data = file.path(init$envopts$top_data_dir, data_dir)
message("Selected data dir:",rvs$gui$paths$data)
if (isTruthy(data_dir)) {
rvs$gui$paths$data = file.path(init$envopts$top_data_dir, data_dir)
message("Selected data dir:",rvs$gui$paths$data)
}
})
......@@ -1045,32 +1057,9 @@ mk_shinyscreen_server <- function(projects,init) {
observeEvent(input$datafiles_b,{
new_file = input$dfile_list
if (isTruthy(new_file)) {
rvs$gui$filetag = filetag_add_file(rvs$gui$filetag,
new_file)
if (isTruthy(new_file)) rvs$gui$filetag = filetag_add_file(rvs$gui$filetag,
new_file)
## curr_file = rvs$gui$datatab$file
## curr_tag = rvs$gui$datatab$tag
## curr_adduct = rvs$gui$datatab$adduct
## curr_set = rvs$gui$datatab$set
## res_file = union(curr_file,new_file)
## res_adduct = c(curr_adduct,rep(NA_character_,nd))
## res_set = c(curr_set,rep(NA_character_,nd))
## rvs$gui$datatab$file = res_file
## rvs$gui$datatab$tag = add_new_def_tag(as.character(rvs$gui$datatab$tag),nd)
## rvs$gui$datatab$adduct = res_adduct
## rvs$gui$datatab$set = res_set
}
updateSelectInput(session=session,
inputId="dfile_list",
selected=NULL)
})
observeEvent(input$rem_dfiles_b,{
......@@ -1162,7 +1151,7 @@ mk_shinyscreen_server <- function(projects,init) {
rv_extr_flag(F)
rvs$m = run(m=rvs$m,
envopts=init$envopts,
phases=c("setup","comptab","extract"))
phases=c("setup","comptab","db","extract"))
rvs$status$is_extracted_stat = "Yes."
rvs$status$is_qa_stat = "No."
fn_c_state = file.path(rvs$m$run$paths$project,
......@@ -1178,7 +1167,7 @@ mk_shinyscreen_server <- function(projects,init) {
observeEvent(input$presc_b,{
if (NROW(rvs$m$extr$ms1)>0L) {
if (NROW(rvs$m$db$extr$cgm$ms1)>0L) {
## Update just prescreening conf.
rvs$m = app_update_conf(input=input,
gui=rvs$gui,
......@@ -1310,9 +1299,10 @@ mk_shinyscreen_server <- function(projects,init) {
observeEvent(input$make_report_b,{
isolate({
ms1 = rvs$m$extr$ms1
ms2 = rvs$m$extr$ms2
ms1 = rvs$m$db$extr$cgm$ms1
ms2 = rvs$m$db$extr$cgm$ms2
summ = rvs$m$out$tab$summ
db = rvs$m$db
})
req(NROW(summ)>0L)
......@@ -1334,18 +1324,21 @@ mk_shinyscreen_server <- function(projects,init) {
names(kvals) = key
message('Compound index row: ',ri)
p1 = make_eic_ms1_plot(ms1,summ,kvals=kvals,
labs=labs,
asp=PLOT_EIC_ASPECT,
rt_range=rt_range,
i_range=i_range,
colrdata = colrdata) + theme_print()
p2 = make_eic_ms2_plot(summ,kvals=kvals,
labs=labs,
asp=PLOT_EIC_ASPECT,
rt_range=rt_range,
colrdata = colrdata) + theme_print()
p1 = make_eic_ms1_plot(db=db,
summ,
kvals=kvals,
labs=labs,
asp=PLOT_EIC_ASPECT,
rt_range=rt_range,
i_range=i_range,
colrdata = colrdata) + theme_print()
p2 = make_eic_ms2_plot(db=db,
summ,kvals=kvals,
labs=labs,
asp=PLOT_EIC_ASPECT,
rt_range=rt_range,
colrdata = colrdata) + theme_print()
......@@ -1353,11 +1346,11 @@ mk_shinyscreen_server <- function(projects,init) {
smi = rvs$m$out$tab$comp[ID==(id),SMILES][[1]]
p_struc = make_struct_plot(smi)
p_spec = make_spec_ms2_plot(ms2,
summ,
kvals=kvals,
labs=labs,
colrdata=colrdata)+theme_print()
p_spec = make_spec_ms2_plot(db=db,
summ,
kvals=kvals,
labs=labs,
colrdata=colrdata)+theme_print()
cmb = combine_plots(p1,p2,p_spec,p_struc)
print(cmb)
......@@ -1377,7 +1370,7 @@ mk_shinyscreen_server <- function(projects,init) {
fn = file.path(projdir,input$ms2_spectra_tab_name)
shinymsg(paste0("Saving MS2 spectra table to: ",basename(fn)))
tab2file(pack_ms2_w_summ(rvs$m$out$tab$summ,
rvs$m$extr$ms2),
rvs$m$db$extr$spectra),
fn)
shinymsg("Done saving MS2 spectra table.")
})
......@@ -1588,7 +1581,7 @@ mk_shinyscreen_server <- function(projects,init) {
})
output$comp_table = DT::renderDataTable({
## TODO FIXME
## TODO
## cmpds = rf_get_cmpd_tab()
## validate(need(NROW(cmpds)>0,"No compound list loaded yet."))
## DT::datatable(cmpds,
......@@ -1626,8 +1619,6 @@ mk_shinyscreen_server <- function(projects,init) {
m=rvs$m)
kv = rf_get_cindex_kval()
## Some cols that might be needed to be specified
## explicitely. FIXME TODO: Make this more robust.
nsumm = mf_narrow_summ(rvs$m$out$tab$summ,kv,
ms2_rt_i=input$mf_entry_rt_min,
ms2_rt_f=input$mf_entry_rt_max)
......@@ -1639,7 +1630,8 @@ mk_shinyscreen_server <- function(projects,init) {
path = rvs$m$run$metfrag$path,
subpaths = rvs$m$run$metfrag$subpaths,
db_file = rvs$m$run$metfrag$db_file,
stag_tab = stagtab, ms2 = rvs$m$extr$ms2,
stag_tab = stagtab,
ms2 = rvs$m$db$extr$spectra,
runtime=rvs$m$run$metfrag$runtime,
java_bin=rvs$m$run$metfrag$java_bin,
nproc = rvs$m$conf$metfrag$nproc)
......@@ -1768,7 +1760,7 @@ mk_shinyscreen_server <- function(projects,init) {
selMS2 = req(input$sel_spec)
if (NROW(ms2tabsel)!=0L) {
lval = lapply(ms2tabsel[item==(selMS2)],function(x) x)
ms2 = rvs$m$extr$ms2
ms2 = rvs$m$db$extr$cgm$ms2
kval = rf_get_cindex_kval()
allval = c(kval,lval)
## There can be some duplicates.
......@@ -1778,10 +1770,11 @@ mk_shinyscreen_server <- function(projects,init) {
#more than the names existing in extr$ms2. Also,
#BASE_KEY_MS2 does not contain `an', so we need to readd
#it.
key = unique(c(names(allval)[names(allval) %in% BASE_KEY_MS2],"an"))
key = unique(c(names(allval)[names(allval) %in% BASE_KEY_MS2],"scan"))
kval2 = allval[key]
spec = get_data_from_key(ms2,kval2)[,.(mz,intensity)]
## as.character(lapply(1L:NROW(spec),function(nr) paste0(spec[nr,mz]," ",spec[nr,intensity])))
## Only precid and scan can be used for selection in spectra.
kval2 = as.data.table(kval2[c("precid","scan")])
spec = rvs$m$db$extr$spectra[kval2,.(mz,intensity),on=.(precid,scan)]
print(as.data.frame(spec),row.names=F)
} else {
......
......@@ -37,13 +37,12 @@ new_state <- function() {
runtime_from_conf <- function(run,envopts,conf) {
lst_cmpl <- conf$compounds$lists
lst_fn_cmpl <- lapply(names(lst_cmpl),function (nm) {
bfn_cmpl <- lst_cmpl[[nm]]
lst_fn_cmpl <- lapply(lst_cmpl,function (lst) {
bfn_cmpl <- lst
fn <- file.path(run$paths$project,bfn_cmpl)
if (!file.exists(fn)) stop("File ", fn, " does not exist in ", run$paths$project," .")
fn
})
names(lst_fn_cmpl) <- names(lst_cmpl)
run$paths$compounds$lists <- lst_fn_cmpl
run$paths$data = norm_path(file.path(envopts$top_data_dir,conf$paths$data))
......@@ -95,7 +94,7 @@ new_runtime_state <- function(project,envopts,conf=NULL) {
## If MetFrag possible, set up the file structure.
if (metfrag$cando_metfrag) {
mfdir = file.path(project_path,"metfrag")
dir.create(mfdir)
dir.create(mfdir, showWarnings = F)
metfrag$path=mfdir
subpaths = list(results = "results",
config = "config",
......@@ -204,7 +203,7 @@ new_project <- function(project,envopts,datatab=NULL,conf=NULL) {
check_file_absent(fn_conf,what="conf-file")
yaml::yaml.load_file(fn_conf)
} else conf
m$conf$compounds$lists = label_cmpd_lists(m$conf$compounds$lists)
## m$conf$compounds$lists = label_cmpd_lists(m$conf$compounds$lists)
m$run = new_runtime_state(project,envopts=envopts,conf=m$conf)
if (!is.null(datatab)) {
m$input$tab$mzml = datatab
......@@ -409,9 +408,11 @@ pack_ms2_w_summ <- function(summ,ms2) {
## Takes summ, finds entries with high quality spectra and subsets ms2 based on that.
## Take the columns we need from summ.
x = summ[ms2_sel==T,.SD,.SDcols=c(key(summ),"mz","SMILES","Formula","Name")]
mrg_keys = c(intersect(key(ms2),key(summ)),"an")
ms2[x,.(mz=i.mz,ms2_spectrum=encode_ms2_to_line(.SD[,c("mz","intensity")])),on=mrg_keys,by=.EACHI]
x = summ[ms2_sel==T,.SD,.SDcols=c(key(summ),"mz","SMILES","Name")]
mrg_keys = intersect(key(ms2),key(summ))
mrg_keys = c(mrg_keys,"scan")
## ms2[x,.(mz=i.mz,ms2_spectrum=encode_ms2_to_line(.SD[,c("mz","intensity")])),on=mrg_keys,by=.EACHI]
ms2[x,.(ion_mz=i.mz,mz,intensity),on=mrg_keys,by=.EACHI]
}
......@@ -431,3 +432,40 @@ pack_project <- function(m,fn_arch) {
})
fn_arch
}
join_compound_lists <- function(fname) {
coll <- list()
fields <- colnames(EMPTY_CMPD_LIST)
coltypes <- c(ID="character",
SMILES="character",
Formula="character",
Name="character",
RT="numeric",
mz="numeric")
l=0L
for (fn in fname) {
l = l + 1L
## Figure out column headers.
nms <- colnames(file2tab(fn,nrows=0))
## Read the table. Knowing column headers prevents unnecessary
## warnings.
dt <- file2tab(fn, colClasses=coltypes[nms])
verify_cmpd_l(dt=dt,fn=fn)
# nonexist <- setdiff(fnfields,fields)
coll[[l]] <- dt #if (length(nonexist)==0) dt else dt[,(nonexist) := NULL]
coll[[l]]$ORIG <- fn
}
if (length(fname)>0) rbindlist(l=c(list(EMPTY_CMPD_LIST), coll), use.names = T, fill = T) else EMPTY_CMPD_LIST
}
process_cmpd_sets <- function(cmpdlist) {
if (nrow(cmpdlist)==0L) return(EMPTY_CMPD_LIST)
## Process sets.
if (! ("set" %in% colnames(cmpdlist))) cmpdlist$set=NA_character_ else cmpdlist[,set:=as.character(set)]
slugs = cmpdlist[,.(slug=gen_fname_slug(ORIG)),by="ORIG"]
slugs[,slug:=uniqy_slugs(slug)]
cmpdlist[slugs,set:=fifelse(is.na(set),i.slug,set),on="ORIG"]
cmpdlist
}