index.html 32.1 KB
 Documenter.jl committed May 11, 2021 1   Documenter.jl committed Nov 05, 2021 2 Basic usage of StandardModel · COBREXA.jl

Basic usage of StandardModel

In this tutorial we will use COBREXA's StandardModel and functions that specifically operate on it. As usual we will use the toy model of E. coli for demonstration.

!isfile("e_coli_core.json") && 
Documenter.jl committed May 11, 2021  3      4                                                                                                                                                                                                                                                                                                 download("http://bigg.ucsd.edu/static/models/e_coli_core.json", "e_coli_core.json")  
Documenter.jl committed May 11, 2021  5                                                                                                                                                                                                                                                                                                    using COBREXA

model = load_model(StandardModel, "e_coli_core.json") # we specifically want to load a StandardModel from the model file
Metabolic model of type StandardModel 
Documenter.jl committed Dec 01, 2021  6                                                                                                                                                                                                                                                                                                    sparse([9, 51, 55, 64, 65, 34, 44, 59, 66, 64  …  20, 22, 23, 25, 16, 17, 34, 44, 57, 59], [1, 1, 1, 1, 1, 2, 2, 2, 2, 3  …  93, 93, 94, 94, 95, 95, 95, 95, 95, 95], [1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0  …  1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0], 72, 95) 
Documenter.jl committed May 11, 2021  7      8                                                                                                                                                                                                                                                                                             Number of reactions: 95 Number of metabolites: 72 
Documenter.jl committed May 11, 2021  9      10                                                                                                                                                                                                                                                                                            
Note: Loading StandardModels implicitly uses convert

When using load_model(StandardModel, file_location) the model at file_location is first loaded into its inferred format and is then converted to a StandardModel using the generic accessor interface. Thus, data loss may occur. Always check your model to ensure that nothing important has been lost.

Basic analysis

As before, for optimization based analysis we need to load an optimizer. Here we will use Tulip.jl to solve the linear programs of this tutorial. Refer to the basic constraint-based analysis tutorial for more informaiton.

All the normal analysis functions work on StandardModel, due to it also having the same generic accessor interface as all the other model types.

using Tulip  
Documenter.jl committed Jun 11, 2021  11                                                                                                                                                                                                                                                                                                   fluxes = flux_balance_analysis_dict( 
Documenter.jl committed May 11, 2021  12      13      14      15                                                                                                                                                                                                                                                                               model,     Tulip.Optimizer;     modifications = [         change_objective("BIOMASS_Ecoli_core_w_GAM"), 
Documenter.jl committed Aug 16, 2021  16      17                                                                                                                                                                                                                                                                                                   change_constraint("EX_glc__D_e"; lb = -12, ub = -12),         change_constraint("EX_o2_e"; lb = 0, ub = 0), 
Documenter.jl committed May 11, 2021  18      19      20      21      22      23      24      25      26      27      28      29      30      31      32      33      34      35      36      37      38                                                                                                                                       ], )
Dict{String, Float64} with 95 entries:   "ACALD"       => -9.78427   "PTAr"        => 10.0729   "ALCD2x"      => -9.78427   "PDH"         => 1.98381e-9   "PYK"         => 9.94501   "CO2t"        => 0.487021   "EX_nh4_e"    => -1.48633   "MALt2_2"     => -0.0   "CS"          => 0.294088   "PGM"         => -22.8676   "TKT1"        => -0.0487648   "EX_mal__L_e" => -0.0   "ACONTa"      => 0.294088   "EX_pi_e"     => -1.00274   "GLNS"        => 0.069699   "ICL"         => 5.34932e-11   "EX_o2_e"     => -0.0   "FBA"         => 11.7289   "EX_gln__L_e" => -0.0 
Documenter.jl committed Nov 02, 2021  39                                                                                                                                                                                                                                                                                                     ⋮             => ⋮

This is not very exciting yet, since every other model type can also do this. However, deeper inspection of flux results is possible when using StandardModel.

Inspecting the flux solution: atom_fluxes

It is sometimes interesting to keep track of the atoms entering and leaving the system through boundary reactions. This can be inspected by calling atom_fluxes. That gives you the flux of individual atoms getting consumed and produced by all reactions, based on fluxes. We erase the reaction that consumes the atoms for creating biomass, to see how much mass the "rest" of the reaction produces for it:

fluxes_without_biomass = copy(fluxes); 
Documenter.jl committed Jun 11, 2021  40                                                                                                                                                                                                                                                                                                   delete!(fluxes_without_biomass, "BIOMASS_Ecoli_core_w_GAM"); 
Documenter.jl committed Aug 05, 2021  41                                                                                                                                                                                                                                                                                                   atom_fluxes(model, fluxes_without_biomass)
Dict{String, Float64} with 6 entries: 
Documenter.jl committed Jun 11, 2021  42                                                                                                                                                                                                                                                                                                     "S" => 0.0 
Documenter.jl committed Jun 08, 2021  43      44      45      46                                                                                                                                                                                                                                                                             "C" => 11.5998   "N" => 1.48633   "P" => 1.00274   "H" => 20.7086 
Documenter.jl committed Nov 02, 2021  47                                                                                                                                                                                                                                                                                                     "O" => 12.995

Inspecting the flux solution: metabolite_fluxes

Another useful flux result analysis function is metabolite_fluxes. This function gives an overview of reactions consuming and producing each metabolite.

consuming, producing = metabolite_fluxes(model, fluxes) 
Documenter.jl committed May 11, 2021  48                                                                                                                                                                                                                                                                                                    
Documenter.jl committed Aug 09, 2021  49                                                                                                                                                                                                                                                                                                   consuming["atp_c"] # reactions consuming atp_c
Dict{String, Float64} with 8 entries: 
Documenter.jl committed May 11, 2021  50      51      52      53                                                                                                                                                                                                                                                                             "PFK"                      => -11.7289   "BIOMASS_Ecoli_core_w_GAM" => -16.3031   "GLNS"                     => -0.069699   "ATPM"                     => -8.39 
Documenter.jl committed Aug 09, 2021  54      55      56                                                                                                                                                                                                                                                                                     "ADK1"                     => -1.66663e-10   "PPCK"                     => -5.66231e-11   "PPS"                      => -1.66662e-10 
Documenter.jl committed May 11, 2021  57                                                                                                                                                                                                                                                                                                     "ATPS4r"                   => -6.80168

Internals of StandardModel

Another benefit of StandardModel is that it supports a richer internal infrastructure that can be used to manipulate internal model attributes in a systematic way. Specifically, the genes, reactions, and metabolites with of a model each have a type. This is particularly useful when modifying or even constructing a model from scratch.

Genes, Reactions, and Metabolites

StandardModel is composed of ordered dictionaries of Genes, Metabolites and Reactions. Ordered dictionaries are used because the order of the reactions and metabolites are important for constructing a stoichiometric matrix since the rows and columns should correspond to the order of the metabolites and reactions returned by calling the accessors metabolites and reactions.

Each StandardModel is composed of the following fields:

fieldnames(StandardModel) # fields of a StandardModel
(:id, :reactions, :metabolites, :genes)

The :genes field of a StandardModel contains an ordered dictionary of gene ids mapped to Genes.

model.genes # the keys of this dictionary are the same as genes(model)
OrderedCollections.OrderedDict{String, Gene} with 137 entries: 
Documenter.jl committed Jul 27, 2021  58      59      60      61      62      63      64      65      66      67      68      69      70      71      72      73      74      75      76      77                                                                                                                                             "b1241" => Gene("b1241", Dict("original_bigg_ids"=>["b1241"]), Dict("sbo"=>["…   "b0351" => Gene("b0351", Dict("original_bigg_ids"=>["b0351"]), Dict("sbo"=>["…   "s0001" => Gene("s0001", Dict("original_bigg_ids"=>["s0001"]), Dict("sbo"=>["…   "b1849" => Gene("b1849", Dict("original_bigg_ids"=>["b1849"]), Dict("sbo"=>["…   "b3115" => Gene("b3115", Dict("original_bigg_ids"=>["b3115"]), Dict("sbo"=>["…   "b2296" => Gene("b2296", Dict("original_bigg_ids"=>["b2296"]), Dict("sbo"=>["…   "b1276" => Gene("b1276", Dict("original_bigg_ids"=>["b1276"]), Dict("sbo"=>["…   "b0118" => Gene("b0118", Dict("original_bigg_ids"=>["b0118"]), Dict("sbo"=>["…   "b0474" => Gene("b0474", Dict("original_bigg_ids"=>["b0474"]), Dict("sbo"=>["…   "b0116" => Gene("b0116", Dict("original_bigg_ids"=>["b0116"]), Dict("sbo"=>["…   "b0727" => Gene("b0727", Dict("original_bigg_ids"=>["b0727"]), Dict("sbo"=>["…   "b0726" => Gene("b0726", Dict("original_bigg_ids"=>["b0726"]), Dict("sbo"=>["…   "b2587" => Gene("b2587", Dict("original_bigg_ids"=>["b2587"]), Dict("sbo"=>["…   "b0356" => Gene("b0356", Dict("original_bigg_ids"=>["b0356"]), Dict("sbo"=>["…   "b1478" => Gene("b1478", Dict("original_bigg_ids"=>["b1478"]), Dict("sbo"=>["…   "b3734" => Gene("b3734", Dict("original_bigg_ids"=>["b3734"]), Dict("sbo"=>["…   "b3733" => Gene("b3733", Dict("original_bigg_ids"=>["b3733"]), Dict("sbo"=>["…   "b3736" => Gene("b3736", Dict("original_bigg_ids"=>["b3736"]), Dict("sbo"=>["…   "b3737" => Gene("b3737", Dict("original_bigg_ids"=>["b3737"]), Dict("sbo"=>["…   ⋮       => ⋮

The Gene type is a struct that can be used to store information about genes in a StandardModel. Each Gene is composed of the following fields:

fieldnames(Gene)
(:id, :notes, :annotations)
Tip: Use <tab> complete to explore the structure of types

Use <tab> to quickly explore the fields of a struct. For example, Gene.<tab> will list all the fields shown above.

The keys used in the ordered dictionaries in model.genes are the ids returned using the generic accessor genes. Genes have pretty printing, as demonstrated below for a random gene drawn from the model:

random_gene_id = genes(model)[rand(1:n_genes(model))] 
Documenter.jl committed Dec 01, 2021  78                                                                                                                                                                                                                                                                                                   model.genes[random_gene_id]
Gene.id: b3114 
Documenter.jl committed May 11, 2021  79                                                                                                                                                                                                                                                                                                   Gene.notes:  
Documenter.jl committed Dec 01, 2021  80                                                                                                                                                                                                                                                                                                   	original_bigg_ids: ["b3114"] 
Documenter.jl committed May 11, 2021  81      82                                                                                                                                                                                                                                                                                           Gene.annotations:  	sbo: ["SBO:0000243"] 
Documenter.jl committed Dec 01, 2021  83      84      85      86      87      88      89      90                                                                                                                                                                                                                                           	uniprot: ["P42632"] 	ecogene: ["EG12758"] 	ncbigene: ["947623"] 	ncbigi: ["49176316"] 	refseq_locus_tag: ["b3114"] 	refseq_name: ["tdcE"] 	asap: ["ABE-0010242"] 	refseq_synonym: JW5522, ..., yhaS 
Documenter.jl committed May 11, 2021  91                                                                                                                                                                                                                                                                                                   

The same idea holds for both metabolites (stored as Metabolites) and reactions (stored as Reactions). This is demonstrated below.

random_metabolite_id = metabolites(model)[rand(1:n_metabolites(model))] 
Documenter.jl committed Dec 01, 2021  92      93      94      95                                                                                                                                                                                                                                                                           model.metabolites[random_metabolite_id]
Metabolite.id: nh4_c Metabolite.formula: N1H4 Metabolite.charge: 1 Metabolite.compartment: c 
Documenter.jl committed May 11, 2021  96                                                                                                                                                                                                                                                                                                   Metabolite.notes:  
Documenter.jl committed Dec 01, 2021  97                                                                                                                                                                                                                                                                                                   	original_bigg_ids: ["nh4_c"] 
Documenter.jl committed May 11, 2021  98                                                                                                                                                                                                                                                                                                   Metabolite.annotations:  
Documenter.jl committed Dec 01, 2021  99      100      101                                                                                                                                                                                                                                                                                 	envipath: 650babc9-9d68-4b73-9... 	kegg.drug: ["D02916", "D02915"] 	kegg.compound: ["C00014", "C01342"] 
Documenter.jl committed May 11, 2021  102                                                                                                                                                                                                                                                                                                  	sbo: ["SBO:0000247"] 
Documenter.jl committed Dec 01, 2021  103      104      105      106      107      108      109      110      111                                                                                                                                                                                                                          	sabiork: ["1268", "43"] 	biocyc: ["META:AMMONIUM", "META:AMMONIA"] 	chebi: CHEBI:44284, ..., CHEBI:13407 	metanetx.chemical: ["MNXM15"] 	inchi_key: QGZKDVFQNNGYKY-UHFFF... 	hmdb: ["HMDB00051", "HMDB41827"] 	bigg.metabolite: ["nh4"] 	seed.compound: ["cpd19013", "cpd00013"] 	reactome.compound: 5693978, ..., 113561 
Documenter.jl committed May 11, 2021  112                                                                                                                                                                                                                                                                                                  
random_reaction_id = reactions(model)[rand(1:n_reactions(model))] 
Documenter.jl committed Dec 01, 2021  113      114                                                                                                                                                                                                                                                                                         model.reactions[random_reaction_id]
Reaction.id: ICL Reaction.metabolites: 1.0 icit_c →  1.0 succ_c + 1.0 glx_c 
Documenter.jl committed Jun 21, 2021  115                                                                                                                                                                                                                                                                                                  Reaction.lb: 0.0 
Documenter.jl committed May 11, 2021  116                                                                                                                                                                                                                                                                                                  Reaction.ub: 1000.0 
Documenter.jl committed Dec 01, 2021  117      118                                                                                                                                                                                                                                                                                         Reaction.grr: (b4015) Reaction.subsystem: Anaplerotic reactions 
Documenter.jl committed May 11, 2021  119                                                                                                                                                                                                                                                                                                  Reaction.notes:  
Documenter.jl committed Dec 01, 2021  120                                                                                                                                                                                                                                                                                                  	original_bigg_ids: ["ICL"] 
Documenter.jl committed May 11, 2021  121                                                                                                                                                                                                                                                                                                  Reaction.annotations:  
Documenter.jl committed Dec 01, 2021  122      123      124      125      126      127      128      129      130                                                                                                                                                                                                                          	bigg.reaction: ["ICL"] 	sabiork: ["911"] 	metanetx.reaction: ["MNXR100789"] 	rhea: 13248, ..., 13247 	sbo: ["SBO:0000176"] 	seed.reaction: ["rxn00336"] 	kegg.reaction: ["R00479"] 	ec-code: ["4.1.3.1"] Reaction.objective_coefficient: 0.0 
Documenter.jl committed May 11, 2021  131      132      133      134      135      136      137      138      139      140      141      142      143      144      145      146      147      148      149      150      151      152      153      154      155      156      157      158      159      160      161      162      163  

StandardModel can be used to build your own metabolic model or modify an existing one. One of the main use cases for StandardModel is that it can be used to merge multiple models or parts of multiple models together. Since the internals are uniform inside each StandardModel, attributes of other model types are squashed into the required format (using the generic accessors). This ensures that the internals of all StandardModels are the same - allowing easy systematic evaluation.

Warning: Combining models with different namespaces is tricky

Combining models that use different namespaces requires care. For example, in some models the water exchange reaction is called EX_h2o_e, while in others it is called R_EX_h2o_s. This needs to manually addressed (for now) to prevent duplicate, e.g. reactions, from being added.

Checking the internals of StandardModels: annotation_index

Often when models are automatically reconstructed duplicate genes, reactions or metabolites end up in a model. COBREXA exports annotation_index to check for cases where the id of a struct may be different, but the annotations the same (possibly suggesting a duplication). annotation_index builds a dictionary mapping annotation features to the ids of whatever struct you are inspecting. This makes it easy to find structs that share certain annotation features.

rxn_annotations = annotation_index(model.reactions)
Dict{String, Dict{String, Set{String}}} with 10 entries:   "ec-code"           => Dict("3.6.3.37"=>Set(["ATPM"]), "3.6.3.42"=>Set(["ATPM…   "sabiork"           => Dict("109"=>Set(["PGL"]), "762"=>Set(["GLUN"]), "155"=…   "metanetx.reaction" => Dict("MNXR104869"=>Set(["TKT2"]), "MNXR99715"=>Set(["E…   "rhea"              => Dict("27626"=>Set(["TKT2"]), "10229"=>Set(["ACONTa"]),…   "sbo"               => Dict("SBO:0000627"=>Set(["EX_for_e", "EX_nh4_e", "EX_p…   "seed.reaction"     => Dict("rxn05297"=>Set(["GLUt2r"]), "rxn09717"=>Set(["PY…   "kegg.reaction"     => Dict("R00114"=>Set(["GLUSy"]), "R00199"=>Set(["PPS"]),…   "biocyc"            => Dict("META:TRANS-RXN-121B"=>Set(["FUMt2_2"]), "META:PE…   "reactome.reaction" => Dict("R-TGU-71397"=>Set(["PDH"]), "R-XTR-70449"=>Set([…   "bigg.reaction"     => Dict("ACALD"=>Set(["ACALD"]), "PTAr"=>Set(["PTAr"]), "…
rxn_annotations["ec-code"]
Dict{String, Set{String}} with 141 entries:   "3.6.3.37" => Set(["ATPM"])   "3.6.3.42" => Set(["ATPM"])   "3.6.3.38" => Set(["ATPM"])   "3.6.3.19" => Set(["ATPM"])   "2.3.3.1"  => Set(["CS"])   "1.6.1.2"  => Set(["NADTRHD"])   "3.6.3.35" => Set(["ATPM"])   "6.2.1.5"  => Set(["SUCOAS"])   "6.3.5.4"  => Set(["GLUN"])   "3.6.3.49" => Set(["ATPM"])   "3.6.3.51" => Set(["ATPM"])   "1.2.1.12" => Set(["GAPD"])   "3.6.3.32" => Set(["ATPM"])   "2.3.3.3"  => Set(["CS"])   "2.7.4.3"  => Set(["ADK1"])   "6.3.5.5"  => Set(["GLUN"])   "3.5.1.2"  => Set(["GLUN"])   "1.1.1.49" => Set(["G6PDH2r"])   "5.3.1.9"  => Set(["PGI"])   ⋮          => ⋮

The annotation_index function can also be used on Reactions and Genes in the same way.

Checking the internals of StandardModels: check_duplicate_reaction

Another useful function is check_duplicate_reaction, which checks for reactions that have duplicate (or similar) reaction equations.

pgm_duplicate = Reaction() pgm_duplicate.id = "pgm2" # Phosphoglycerate mutase pgm_duplicate.metabolites = Dict{String,Float64}("3pg_c" => 1, "2pg_c" => -1) 
Documenter.jl committed May 11, 2021  164                                                                                                                                                                                                                                                                                                  pgm_duplicate
Reaction.id: pgm2 
Documenter.jl committed Aug 02, 2021  165                                                                                                                                                                                                                                                                                                  Reaction.metabolites: 1.0 2pg_c ↔  1.0 3pg_c 
Documenter.jl committed May 11, 2021  166      167      168      169      170      171      172                                                                                                                                                                                                                                            Reaction.lb: -1000.0 Reaction.ub: 1000.0 Reaction.grr: --- Reaction.subsystem: --- Reaction.notes: --- Reaction.annotations: --- Reaction.objective_coefficient: 0.0 
Documenter.jl committed Nov 02, 2021  173                                                                                                                                                                                                                                                                                                  
check_duplicate_reaction(pgm_duplicate, model.reactions; only_metabolites = false) # can also just check if only the metabolites are the same but different stoichiometry is used
"PGM"

Checking the internals of StandardModels: reaction_mass_balanced

Finally, reaction_mass_balanced can be used to check if a reaction is mass balanced based on the formulas of the reaction equation.

rxn_dict = Dict{String,Float64}("3pg_c" => 1, "2pg_c" => -1, "h2o_c" => 1) 
Documenter.jl committed Aug 05, 2021  174      175      176      177                                                                                                                                                                                                                                                                       reaction_mass_balanced(model, rxn_dict)
false

Now to determine which atoms are unbalanced, you can use reaction_atom_balance

reaction_atom_balance(model, rxn_dict)
Dict{String, Float64} with 4 entries:   "C" => 0.0   "P" => 0.0   "H" => 2.0 
Documenter.jl committed Dec 06, 2021  178                                                                                                                                                                                                                                                                                                    "O" => 1.0

Note, since pgm_duplicate is not in the model, we cannot use the other variants of this function because they find the reaction equation stored inside the model.