# COnstraint-Based Reconstruction and EXascale Analysis
[docs-img-stable]: https://img.shields.io/badge/docs-stable-blue.svg
[docs-url-stable]: https://lcsb-biocore.github.io/COBREXA.jl
[docs-img-dev]: https://img.shields.io/badge/docs-latest-0af.svg
[docs-url-dev]: https://lcsb-biocore.github.io/COBREXA.jl/dev/
[ci-img]: https://github.com/LCSB-BioCore/COBREXA.jl/actions/workflows/ci.yml/badge.svg?branch=master
[ci-url]: https://github.com/LCSB-BioCore/COBREXA.jl/actions/workflows/ci.yml
[cov-img]: https://codecov.io/gh/LCSB-BioCore/COBREXA.jl/branch/master/graph/badge.svg?token=H3WSWOBD7L
[cov-url]: https://codecov.io/gh/LCSB-BioCore/COBREXA.jl
[contrib-img]: https://img.shields.io/badge/contributions-start%20here-green
[contrib-url]: https://github.com/LCSB-BioCore/COBREXA.jl/blob/master/.github/CONTRIBUTING.md
[repostatus-url]: https://www.repostatus.org/#active
[repostatus-img]: https://www.repostatus.org/badges/latest/active.svg
| **Documentation** | **Tests** | **Coverage** | **How to contribute?** | **Project status** |
|:--------------:|:-------:|:---------:|:---------:|:---------:|
| [![docs-img-stable]][docs-url-stable] [![docs-img-dev]][docs-url-dev] | [![CI][ci-img]][ci-url] | [![codecov][cov-img]][cov-url] | [![contrib][contrib-img]][contrib-url] | [![repostatus-img]][repostatus-url] |
This is package provides constraint-based reconstruction and analysis tools for
exa-scale metabolic models in Julia.
## How to get started
### Prerequisites and requirements
- **Operating system**: Use Linux (Debian, Ubuntu or centOS), MacOS, or Windows
10 as your operating system. `COBREXA` has been tested on these systems.
- **Julia language**: In order to use `COBREXA`, you need to install Julia 1.0
or higher. Download and follow the installation instructions for Julia
[here](https://julialang.org/downloads/).
- **Hardware requirements**: `COBREXA` runs on any hardware that can run Julia,
and can easily use resources from multiple computers interconnected on a
network. For processing large datasets, you are required to ensure that the
total amount of available RAM on all involved computers is larger than the
data size.
- **Optimization solvers**: `COBREXA` uses
[`JuMP.jl`](https://github.com/jump-dev/JuMP.jl) to formulate optimization
problems and is compatible with all [`JuMP` supported
solvers](https://jump.dev/JuMP.jl/stable/installation/#Supported-solvers).
However, to perform analysis at least one of these solvers needs to be
installed on your machine. For a pure Julia implementation, you may use e.g.
[`Tulip.jl`](https://github.com/ds4dm/Tulip.jl), but other solvers (GLPK,
Gurobi, ...) work just as well.
:bulb: If you are new to Julia, it is advisable to [familiarize yourself with
the environment
first](https://docs.julialang.org/en/v1/manual/getting-started/). Use the
Julia [documentation](https://docs.julialang.org) to solve various
language-related issues, and the [Julia package manager
docs](https://julialang.github.io/Pkg.jl/v1/getting-started/) to solve
installation-related difficulties. Of course, [the Julia
channel](https://discourse.julialang.org/) is another fast and easy way to find
answers to Julia specific questions.
### Quick start guide
You can install COBREXA from Julia repositories. Start `julia`, **press `]`** to
switch to the Packaging environment, and type:
```
add COBREXA
```
You also need to install your favorite solver supported by `JuMP.jl`, typing
e.g.:
```
add Tulip
```
When the packages are installed, switch back to the "normal" julia shell by
pressing Backspace (the prompt should change color back to green). After that,
you can download [a SBML model from the
internet](http://bigg.ucsd.edu/models/e_coli_core) and perform a
flux balance analysis as follows:
```julia
using COBREXA # loads the package
using Tulip # loads the optimization solver
# download the model
download("http://bigg.ucsd.edu/static/models/e_coli_core.xml", "e_coli_core.xml")
# open the SBML file and load the contents
model = load_model("e_coli_core.xml")
# run a FBA
fluxes = flux_balance_analysis_dict(model, Tulip.Optimizer)
```
The variable `fluxes` will now contain a dictionary of the computed optimal
flux of each reaction in the model:
```
Dict{String,Float64} with 95 entries:
"R_EX_fum_e" => 0.0
"R_ACONTb" => 6.00725
"R_TPI" => 7.47738
"R_SUCOAS" => -5.06438
"R_GLNS" => 0.223462
"R_EX_pi_e" => -3.2149
"R_PPC" => 2.50431
"R_O2t" => 21.7995
"R_G6PDH2r" => 4.95999
"R_TALA" => 1.49698
⋮ => ⋮
```
#### Model variant processing
The main feature of COBREXA.jl is the ability to easily specify and process
many analyses at once, in parallel. Let's see how the organism would perform if
some reactions were disabled:
```julia
# convert to a model type that is easy to modify
m = convert(StandardModel, m)
# find the model objective value if oxygen and carbon dioxide transports are disabled
screen(m,
variants=[
[], # no modifications
[with_changed_bound("O2t", lower=0.0, upper=0.0)], # disable oxygen
[with_changed_bound("CO2t", lower=0.0, upper=0.0)], # disable CO2
[with_changed_bound("O2t", lower=0.0, upper=0.0),
with_changed_bound("CO2t", lower=0.0, upper=0.0)], # disable both
],
analysis = x -> flux_balance_analysis_dict(x, Tulip.Optimizer)["BIOMASS_Ecoli_core_w_GAM"],
)
```
You should receive a result showing that missing oxygen transport makes the
biomass production much harder:
```julia
4-element Vector{Float64}:
0.8739215022674809
0.21166294973372796
0.46166961413944896
0.21114065173865457
```
Most importantly, such analyses can be easily specified by automatically
generating long lists of the modifications to apply to the model, and
parallelized:
```julia
# load the task distribution package, add several worker nodes, and load
# COBREXA and the solver on the nodes
using Distributed
addprocs(4)
@everywhere using COBREXA, Tulip
# get a list of the workers
worker_list = workers()
# run the processing in parallel for many model variants
res = screen(m,
variants=[
# specify one variant for each reaction in the model, with that reaction knocked out
[with_changed_bound(reaction_id, lower=0.0, upper=0.0)]
for reaction_id in reactions(m)
],
analysis = model -> begin
# we need to check if the model even found a feasible solution, which
# may not be the case if we knock out important reactions
sol = flux_balance_analysis_dict(model, Tulip.Optimizer)
isnothing(sol) ? nothing : sol["BIOMASS_Ecoli_core_w_GAM"]
end,
# run the screening in parallel on all workers from the list
workers = worker_list,
)
```
In result, you should get a long list of the biomass production for each
reaction knockout. Let's decorate it with reaction names:
```julia
Dict(reactions(m) .=> res)
```
...which should output an easily accessible dictionary with all the objective
values named, giving a quick overview of which reactions are critical for the
model organism to create biomass:
```julia
Dict{String, Union{Nothing, Float64}} with 95 entries:
"ACALD" => 0.873922
"PTAr" => 0.873922
"ALCD2x" => 0.873922
"PDH" => 0.796696
"PYK" => 0.864926
"CO2t" => 0.46167
"EX_nh4_e" => 1.44677e-15
"MALt2_2" => 0.873922
"CS" => 2.44779e-14
"PGM" => 1.04221e-15
"TKT1" => 0.864759
⋮ => ⋮
```
### Testing the installation
If you run a non-standard platform (e.g. a customized operating system), or if
you added any modifications to the `COBREXA` source code, you may want to run
the test suite to ensure that everything works as expected:
```julia
] test COBREXA
```
## Acknowledgements
`COBREXA.jl` is developed at the Luxembourg Centre for Systems Biomedicine of
the University of Luxembourg ([uni.lu/lcsb](https://wwwen.uni.lu/lcsb)),
cooperating with the Institute for Quantitative and Theoretical Biology at the Heinrich
Heine University in Düsseldorf ([qtb.hhu.de](https://www.qtb.hhu.de/)).
The development was supported by European Union's Horizon 2020 Programme under
PerMedCoE project ([permedcoe.eu](https://permedcoe.eu/)) agreement no. 951773.