Commit 3d09d6e4 authored by Rauschenberger

automation

parent 49360ff7
Package: spliceQTL
Package: colasso
Version: 0.0.0
Title: Alternative Splicing
Description: Implements test for alternative splicing.
Title: colasso regression
Description: Implements colasso.
Depends: R (>= 3.0.0)
Imports: lme4, globaltest, edgeR, snpStats, refGenome, R.utils, methods, SummarizedExperiment, vcfR
Imports: glmnet, MASS
Suggests: knitr, testthat
Authors@R: c(person("Armin","Rauschenberger",email="a.rauschenberger@vumc.nl",role=c("aut","cre")),
person("Renee","Menezes",role=c("aut")))
Authors@R: person("Armin","Rauschenberger",email="a.rauschenberger@vumc.nl",role=c("aut","cre"))
VignetteBuilder: knitr
License: GPL-3
LazyData: true
RoxygenNote: 6.0.1
URL: https://github.com/rauschenberger/spliceQTL
BugReports: https://github.com/rauschenberger/spliceQTL/issues
URL: https://github.com/rauschenberger/colasso
BugReports: https://github.com/rauschenberger/colasso/issues
# Generated by roxygen2: do not edit by hand
export(adjust.samples)
export(adjust.variables)
export(drop.trivial)
export(get.exons.bbmri)
export(get.exons.geuvadis)
export(get.snps.bbmri)
export(get.snps.geuvadis)
export(grid)
export(map.exons)
export(map.genes)
export(map.snps)
export(match.samples)
export(test.multiple)
export(test.single)
export(visualise)
export(colasso)
export(colasso_compare)
export(colasso_covariate_weights)
export(colasso_marginal_significance)
export(colasso_moderate)
export(colasso_simulate)
export(colasso_weighted_correlation)
......@@ -13,12 +13,12 @@ knitr::opts_chunk$set(
)
```
[![Travis-CI Build Status](https://travis-ci.org/rauschenberger/spliceQTL.svg)](https://travis-ci.org/rauschenberger/spliceQTL)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/rauschenberger/spliceQTL?svg=true)](https://ci.appveyor.com/project/rauschenberger/spliceQTL)
[![Travis-CI Build Status](https://travis-ci.org/rauschenberger/spliceQTL.svg)](https://travis-ci.org/rauschenberger/colasso)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/rauschenberger/spliceQTL?svg=true)](https://ci.appveyor.com/project/rauschenberger/colasso)
```{r,eval=FALSE}
#install.packages("devtools")
devtools::install_github("rauschenberger/spliceQTL")
devtools::install_github("rauschenberger/colasso")
```
......
......@@ -6,7 +6,7 @@
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Articles • spliceQTL</title>
<title>Articles • colasso</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
......@@ -54,7 +54,7 @@
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">spliceQTL</a>
<a class="navbar-link" href="../index.html">colasso</a>
<span class="label label-default" data-toggle="tooltip" data-placement="bottom" title="Released package">0.0.0</span>
</span>
</div>
......@@ -78,10 +78,7 @@
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/code.html">rainbow (code)</a>
</li>
<li>
<a href="../articles/text.html">rainbow (text)</a>
<a href="../articles/vignette.html">colasso</a>
</li>
</ul>
</li>
......@@ -92,7 +89,7 @@
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/rauschenberger/spliceQTL">
<a href="https://github.com/rauschenberger/colasso">
<span class="fa fa-github fa-lg"></span>
</a>
......@@ -117,8 +114,7 @@
<p class="section-desc"></p>
<ul>
<li><a href="code.html">rainbow (code)</a></li>
<li><a href="text.html">rainbow (text)</a></li>
<li><a href="vignette.html">colasso</a></li>
</ul>
</div>
</div>
......@@ -126,7 +122,7 @@
<footer>
<div class="copyright">
<p>Developed by Armin Rauschenberger, Renee Menezes.</p>
<p>Developed by Armin Rauschenberger.</p>
</div>
<div class="pkgdown">
......
<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>rainbow (text) • spliceQTL</title>
<!-- jquery --><script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script><!-- Font Awesome icons --><link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/1.7.1/clipboard.min.js" integrity="sha384-cV+rhyOuRHc9Ub/91rihWcGmMmCXDeksTtCihMupQHSsi8GIIRDG0ThDc3HGQFJ3" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../jquery.sticky-kit.min.js"></script><script src="../pkgdown.js"></script><meta property="og:title" content="rainbow (text)">
<meta property="og:description" content="">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">spliceQTL</a>
<span class="label label-default" data-toggle="tooltip" data-placement="bottom" title="Released package">0.0.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fa fa-home fa-lg"></span>
</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
Articles
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/code.html">rainbow (code)</a>
</li>
<li>
<a href="../articles/text.html">rainbow (text)</a>
</li>
</ul>
</li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/rauschenberger/spliceQTL">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>rainbow (text)</h1>
<small class="dont-index">Source: <a href="https://github.com/rauschenberger/spliceQTL/blob/master/vignettes/text.Rmd"><code>vignettes/text.Rmd</code></a></small>
<div class="hidden name"><code>text.Rmd</code></div>
</div>
<p>This vignette describes the data analysis. All functions are included in the R package <code>spliceQTL</code>.</p>
<div id="data" class="section level2">
<h2 class="hasAnchor">
<a href="#data" class="anchor"></a>1. Data</h2>
<div id="get-snps-geuvadis" class="section level3">
<h3 class="hasAnchor">
<a href="#get-snps-geuvadis" class="anchor"></a><code>get.snps.geuvadis</code>
</h3>
<p>This function obtains the Geuvadis SNP data. It downloads missing genotype data from ArrayExpress, transforms variant call format to binary files, removes SNPs with a low minor allele frequency, labels SNPs in the format “chromosome:position”, and changes sample identifiers.</p>
</div>
<div id="get-snps-bbmri" class="section level3">
<h3 class="hasAnchor">
<a href="#get-snps-bbmri" class="anchor"></a><code>get.snps.bbmri</code>
</h3>
<p>This function obtains the BBMRI SNP data. It limits the analysis to specified biobanks, reads in genotype data in chunks, removes SNPs with missing values (multiple biobanks/technologies), removes SNPs with a low minor allele frequency, and fuses data from multiple biobanks/technologies.</p>
</div>
<div id="get-exons-geuvadis" class="section level3">
<h3 class="hasAnchor">
<a href="#get-exons-geuvadis" class="anchor"></a><code>get.exons.geuvadis</code>
</h3>
<p>This function obtains the Geuvadis exon data. It retains exons on the autosomes, labels exons in the format “chromosome_start_end”, and extracts the corresponding gene names.</p>
</div>
<div id="get-exons-bbmri" class="section level3">
<h3 class="hasAnchor">
<a href="#get-exons-bbmri" class="anchor"></a><code>get.exons.bbmri</code>
</h3>
<p>This function obtains the BBMRI exon data. It loads quality controlled gene expression data, extracts sample identifiers, removes samples without SNP data, loads exon expression data, extracts sample identifiers, retains samples that passed quality control, and retains exons on the autosomes.</p>
</div>
<div id="match-samples" class="section level3">
<h3 class="hasAnchor">
<a href="#match-samples" class="anchor"></a><code>match.samples</code>
</h3>
<p>This function removes duplicate samples from each matrix, only retains samples appearing in all matrices, and brings the samples into the same order.</p>
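<p>A minimal base-R sketch of these three steps, assuming that samples are stored in rows and identified by row names. The helper name <code>match_samples_sketch</code> and the toy matrices are hypothetical; this is not the package implementation.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper for illustration; not the spliceQTL function.
match_samples_sketch &lt;- function(...){
  mats &lt;- list(...)
  # 1. remove duplicate samples (keep the first occurrence)
  mats &lt;- lapply(mats, function(m) m[!duplicated(rownames(m)), , drop=FALSE])
  # 2. retain samples appearing in all matrices
  common &lt;- Reduce(intersect, lapply(mats, rownames))
  # 3. bring the samples into the same order
  lapply(mats, function(m) m[common, , drop=FALSE])
}

# toy matrices: "s2" is duplicated in A, "s4" only appears in B
A &lt;- matrix(1:8, nrow=4, dimnames=list(c("s1","s2","s2","s3"), NULL))
B &lt;- matrix(1:6, nrow=3, dimnames=list(c("s3","s1","s4"), NULL))
matched &lt;- match_samples_sketch(A, B)
</code></pre></div>
<p>In this toy example, only the samples shared by both matrices remain, in the order in which they first appear in <code>A</code>.</p>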
</div>
</div>
<div id="analysis" class="section level2">
<h2 class="hasAnchor">
<a href="#analysis" class="anchor"></a>2. Analysis</h2>
<p>The <span class="math inline">\(n \times q\)</span> matrix <span class="math inline">\(\boldsymbol{Y}\)</span> represents the exons, and the <span class="math inline">\(n \times p_{chr}\)</span> matrices <span class="math inline">\(\boldsymbol{X}_{chr}\)</span> represent the SNPs, where <span class="math inline">\(chr \in \{1,\ldots,22\}\)</span>. The row names contain the sample identifiers, and the column names indicate the genomic location of the variables.</p>
<div id="adjust-samples" class="section level3">
<h3 class="hasAnchor">
<a href="#adjust-samples" class="anchor"></a><code>adjust.samples</code>
</h3>
<p>This function adjusts RNA-seq expression data for different library sizes. The <span class="math inline">\(n \times q\)</span> matrix <span class="math inline">\(\boldsymbol{Y}\)</span> contains the exon data. The library sizes are <span class="math inline">\(\boldsymbol{s}=(s_1,\ldots,s_n)^T\)</span>, where <span class="math inline">\(s_i=\sum_{j=1}^q Y_{ij}\)</span> for all <span class="math inline">\(i\)</span>. The mean library size is <span class="math inline">\(\bar{s}=\sum_{i=1}^n s_i / n\)</span>. We use edgeR to compute the normalisation factors <span class="math inline">\(\boldsymbol{\eta}=(\eta_1,\ldots,\eta_n)^T\)</span>. We then calculate the adjusted normalisation factors <span class="math inline">\(\boldsymbol{\gamma}=(\gamma_1,\ldots,\gamma_n)^T\)</span>, where <span class="math inline">\(\gamma_i=\eta_i s_i / \bar{s}\)</span> for all <span class="math inline">\(i\)</span>. The adjusted value equals <span class="math inline">\(Y_{ij}/\gamma_i\)</span> for all samples <span class="math inline">\(i\)</span> and all exons <span class="math inline">\(j\)</span>.</p>
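<p>The arithmetic of this adjustment can be sketched in base R as follows. The helper name <code>adjust_samples_sketch</code> is hypothetical, and the normalisation factors are set to one as a placeholder; in practice they would come from edgeR. This is an illustration under these assumptions, not the package implementation.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper for illustration; not the spliceQTL function.
# Y: n x q matrix of counts (samples in rows), eta: normalisation factors.
adjust_samples_sketch &lt;- function(Y, eta){
  s &lt;- rowSums(Y)            # library sizes s_i
  gamma &lt;- eta * s / mean(s) # adjusted normalisation factors gamma_i
  Y / gamma                  # divides every entry in row i by gamma_i
}

set.seed(1)
Y &lt;- matrix(rpois(n=12, lambda=50), nrow=3, ncol=4) # simulated counts
eta &lt;- rep(1, times=nrow(Y)) # placeholder for the edgeR normalisation factors
Z &lt;- adjust_samples_sketch(Y, eta)
</code></pre></div>
<p>With all normalisation factors equal to one, every row of the adjusted matrix sums to the mean library size, which is a quick check that the scaling behaves as described.</p>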
</div>
<div id="adjust-variables" class="section level3">
<h3 class="hasAnchor">
<a href="#adjust-variables" class="anchor"></a><code>adjust.variables</code>
</h3>
<p>This function adjusts exon expression data for different exon lengths. We do this separately for each chromosome to decrease memory usage. For this adjustment, we temporarily transform matrices to vectors. An <span class="math inline">\(n \times p\)</span> matrix becomes a vector of length <span class="math inline">\(n \times p\)</span>, with the first <span class="math inline">\(n\)</span> entries corresponding to covariate <span class="math inline">\(1\)</span> and samples <span class="math inline">\(1\)</span> to <span class="math inline">\(n\)</span>, and the last <span class="math inline">\(n\)</span> entries corresponding to covariate <span class="math inline">\(p\)</span> and samples <span class="math inline">\(1\)</span> to <span class="math inline">\(n\)</span>. Let the vector <span class="math inline">\(\boldsymbol{y}=(Y_{11},\ldots,Y_{n1} \boldsymbol{,} \ldots \boldsymbol{,} Y_{1q},\ldots,Y_{nq})^T\)</span> represent exon expression. Let <span class="math inline">\(\boldsymbol{\gamma}=(\gamma_1,\ldots,\gamma_1 \boldsymbol{,} \ldots \boldsymbol{,} \gamma_q,\ldots,\gamma_q)^T\)</span> represent exon lengths. And let <span class="math inline">\(\boldsymbol{k}=(k_1,\ldots,k_1 \boldsymbol{,} \ldots \boldsymbol{,} k_q,\ldots,k_q)^T\)</span> represent gene names. So, <span class="math inline">\(\boldsymbol{\gamma}\)</span> and <span class="math inline">\(\boldsymbol{k}\)</span> contain <span class="math inline">\(q\)</span> blocks of <span class="math inline">\(n\)</span> equal entries. We regress <span class="math inline">\(\boldsymbol{y}\)</span> (exon expression) on a fixed effect for <span class="math inline">\(\boldsymbol{\gamma}\)</span> (exon length) and a random effect for <span class="math inline">\(\boldsymbol{k}\)</span> (gene name). The residuals from this mixed model become our adjusted exon data.</p>
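<p>A minimal sketch of this vectorise, fit and reshape procedure, assuming that the mixed model is fitted with <code>lme4::lmer</code> and a random intercept per gene. The data are simulated, and the exon lengths (in kilobases), gene names and effect sizes are hypothetical; the package may fit the model differently.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Illustrative sketch with simulated data; requires the lme4 package.
set.seed(1)
n &lt;- 20; q &lt;- 6
len &lt;- c(0.10, 0.15, 0.20, 0.25, 0.30, 0.35) # hypothetical exon lengths (kb)
gene &lt;- c("g1","g1","g2","g2","g3","g3")     # hypothetical gene names
shift &lt;- c(-1, -1, 0, 0, 1, 1)               # hypothetical gene effects

gamma &lt;- rep(len, each=n) # q blocks of n equal entries
k &lt;- rep(gene, each=n)    # q blocks of n equal entries
Y &lt;- matrix(rnorm(n*q) + 2*gamma + rep(shift, each=n), nrow=n, ncol=q)
y &lt;- as.vector(Y)         # column-wise: Y_11..Y_n1, ..., Y_1q..Y_nq

# fixed effect for exon length, random intercept for gene
fit &lt;- lme4::lmer(y ~ gamma + (1|k))

# residuals, reshaped back into an n x q matrix
Y_adj &lt;- matrix(stats::residuals(fit), nrow=n, ncol=q)
</code></pre></div>
<p>Because <code>as.vector</code> stacks the matrix column by column, reshaping the residuals with the same dimensions restores the original sample-by-exon layout.</p>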
</div>
<div id="map-genes-map-exons-map-snps-drop-trivial" class="section level3">
<h3 class="hasAnchor">
<a href="#map-genes-map-exons-map-snps-drop-trivial" class="anchor"></a><code>map.genes</code>, <code>map.exons</code>, <code>map.snps</code>, <code>drop.trivial</code>
</h3>
<p>These functions select the variables for the spliceQTL test. First, we retrieve all protein-coding genes, excluding pseudogenes and other transcripts. Second, we attribute exons to genes, including exons within the gene. Third, we attribute SNPs to genes, including SNPs between (1) <span class="math inline">\(10\,000\)</span> base pairs before the start position of the gene, and (2) the end position of the gene. Although this might not occur in practice, exons or SNPs may be attributed to more than one gene. Finally, we exclude genes without any SNPs or with a single exon. It does not make sense to test whether these genes show alternative splicing.</p>
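<p>The SNP window can be illustrated with a small base-R sketch. The helper name <code>map_snps_sketch</code> and the positions are hypothetical, all positions are assumed to lie on the same chromosome, and the gene strand is ignored; this is not the package implementation.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper for illustration; not the spliceQTL function.
# Returns the indices of the SNPs located between (gene_start - window)
# and gene_end, both limits included.
map_snps_sketch &lt;- function(snp_pos, gene_start, gene_end, window=10000){
  which(snp_pos &gt;= gene_start - window &amp; snp_pos &lt;= gene_end)
}

snp_pos &lt;- c(5000, 95000, 100500, 120001) # hypothetical SNP positions
index &lt;- map_snps_sketch(snp_pos, gene_start=100000, gene_end=120000)
</code></pre></div>
<p>Here the second SNP lies within the 10 000 base pairs before the gene and the third lies inside the gene, so both are attributed to it, while the first and the last fall outside the window.</p>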
</div>
<div id="test-multiple" class="section level3">
<h3 class="hasAnchor">
<a href="#test-multiple" class="anchor"></a><code>test.multiple</code>
</h3>
<p>We want to test for alternative splicing along the whole genome. We do not calculate <span class="math inline">\(p\)</span>-values from an asymptotic distribution, but estimate them by permutation. If we tested a single gene, we could use a large number of permutations and obtain a precise estimate. We need at least <span class="math inline">\(21\)</span> permutations (including the identity) to reach the <span class="math inline">\(5\%\)</span> significance level. If one or two of these <span class="math inline">\(21\)</span> test statistics (counting the identity) are at least as large as the one for the observed data, the estimated <span class="math inline">\(p\)</span>-value equals <span class="math inline">\(1/21 \approx 0.0476\)</span> (<span class="math inline">\(&lt;0.05\)</span>) or <span class="math inline">\(2/21 \approx 0.0952\)</span> (<span class="math inline">\(&gt;0.05\)</span>), respectively. If we test multiple genes, we need more permutations to reach Bonferroni-significance. Using the maximum number of permutations for every gene would be computationally too expensive. This is why we invest less in genes with large <span class="math inline">\(p\)</span>-values and more in genes with small <span class="math inline">\(p\)</span>-values. For each gene, we use between <span class="math inline">\(100\)</span> and <span class="math inline">\(p/0.05+1\)</span> permutations, where <span class="math inline">\(p\)</span> is the number of genes. From <span class="math inline">\(100\)</span> permutations onwards, we repeatedly check whether two or more test statistics (counting the identity) are at least as large as the one for the observed data. If so, we stop permuting for this gene, because it can no longer reach Bonferroni-significance.
After <span class="math inline">\(p/0.05+1\)</span> permutations, if one or two test statistics are at least as large as the one for the observed data, the Bonferroni-adjusted estimated <span class="math inline">\(p\)</span>-value equals <span class="math inline">\(0.05*p/(p+0.05)\)</span> (<span class="math inline">\(&lt;0.05\)</span>) or <span class="math inline">\(0.1*p/(p+0.05)\)</span> (<span class="math inline">\(&gt;0.05\)</span>), respectively. These values converge to <span class="math inline">\(0.05\)</span> and <span class="math inline">\(0.1\)</span> when <span class="math inline">\(p\)</span> tends to infinity. Bonferroni-significance requires between <span class="math inline">\(8\,000\)</span> and <span class="math inline">\(60\,000\)</span> permutations on the chromosome level, depending on the number of genes, and about <span class="math inline">\(400\,000\)</span> permutations on the genome level. We therefore adjust for multiple testing within each chromosome, and not across the whole genome.</p>
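<p>The stopping rule can be sketched in base R as follows. The helper name <code>adaptive_perm_sketch</code>, its arguments and the correlation-based test statistic are hypothetical. The sketch assumes that the identity permutation counts towards both the number of permutations and the number of exceedances, and that the responses are permuted; it is not the package implementation.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper for illustration; not the spliceQTL function.
# stat(y, x) must return a scalar, with larger values being more extreme.
adaptive_perm_sketch &lt;- function(y, x, stat, n_genes, min_perm=100, alpha=0.05){
  max_perm &lt;- ceiling(n_genes/alpha) + 1 # upper limit, including the identity
  observed &lt;- stat(y, x)
  exceed &lt;- 1 # the identity permutation always counts
  done &lt;- 1
  while(done &lt; max_perm){
    exceed &lt;- exceed + (stat(sample(y), x) &gt;= observed)
    done &lt;- done + 1
    # from min_perm onwards: stop once significance is out of reach
    if(done &gt;= min_perm &amp;&amp; exceed &gt;= 2){break}
  }
  c(p.value=exceed/done, permutations=done)
}

set.seed(1)
x &lt;- rnorm(50)
stat &lt;- function(y, x) abs(stats::cor(y, x)) # hypothetical test statistic
signal &lt;- adaptive_perm_sketch(y=x+rnorm(50, sd=0.01), x=x, stat=stat, n_genes=10)
noise &lt;- adaptive_perm_sketch(y=rnorm(50), x=x, stat=stat, n_genes=10)
</code></pre></div>
<p>With a strong signal, the loop is expected to run up to the upper limit, so that the estimated <span class="math inline">\(p\)</span>-value can survive the Bonferroni adjustment. Under pure noise, it will usually stop after <span class="math inline">\(100\)</span> permutations.</p>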
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#data">1. Data</a></li>
<li><a href="#analysis">2. Analysis</a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p>Developed by Armin Rauschenberger, Renee Menezes.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="http://pkgdown.r-lib.org/">pkgdown</a>.</p>
</div>
</footer>
</div>
</body>
</html>
<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>colasso • colasso</title>
<!-- jquery --><script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script><!-- Font Awesome icons --><link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/1.7.1/clipboard.min.js" integrity="sha384-cV+rhyOuRHc9Ub/91rihWcGmMmCXDeksTtCihMupQHSsi8GIIRDG0ThDc3HGQFJ3" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../jquery.sticky-kit.min.js"></script><script src="../pkgdown.js"></script><meta property="og:title" content="colasso">
<meta property="og:description" content="">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">colasso</a>
<span class="label label-default" data-toggle="tooltip" data-placement="bottom" title="Released package">0.0.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fa fa-home fa-lg"></span>
</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
Articles
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/vignette.html">colasso</a>
</li>
</ul>
</li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/rauschenberger/colasso">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>colasso</h1>
<small class="dont-index">Source: <a href="https://github.com/rauschenberger/colasso/blob/master/vignettes/vignette.Rmd"><code>vignettes/vignette.Rmd</code></a></small>
<div class="hidden name"><code>vignette.Rmd</code></div>
</div>
<!--
https://archive.ics.uci.edu/ml/datasets.html?format=&task=&att=&area=&numAtt=&numIns=&type=&sort=nameUp&view=list
https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research#Microbe
-->
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="cf">for</span>(rep <span class="cf">in</span> <span class="dv">1</span><span class="op">:</span><span class="dv">4</span>){</a>
<a class="sourceLine" id="cb1-2" data-line-number="2"> <span class="kw">set.seed</span>(rep)</a>
<a class="sourceLine" id="cb1-3" data-line-number="3"> </a>
<a class="sourceLine" id="cb1-4" data-line-number="4"></a>
<a class="sourceLine" id="cb1-5" data-line-number="5"> <span class="co">#----- OBTAIN DATA -----</span></a>
<a class="sourceLine" id="cb1-6" data-line-number="6"> </a>
<a class="sourceLine" id="cb1-7" data-line-number="7"> ### simulated data <span class="al">###</span></a>
<a class="sourceLine" id="cb1-8" data-line-number="8"> <span class="co">#set.seed(rep)</span></a>
<a class="sourceLine" id="cb1-9" data-line-number="9"> <span class="co">#n &lt;- 100; p &lt;- 1000</span></a>
<a class="sourceLine" id="cb1-10" data-line-number="10"> <span class="co">#list &lt;- colasso::colasso_simulate(p=p,n=n,cor="constant")</span></a>
<a class="sourceLine" id="cb1-11" data-line-number="11"> <span class="co">#y &lt;- list$y; X &lt;- list$X</span></a>
<a class="sourceLine" id="cb1-12" data-line-number="12"> </a>
<a class="sourceLine" id="cb1-13" data-line-number="13"> ### mice data <span class="al">###</span></a>
<a class="sourceLine" id="cb1-14" data-line-number="14"> <span class="co">#data(mice,package="BGLR")</span></a>
<a class="sourceLine" id="cb1-15" data-line-number="15"> <span class="co">#nsel &lt;- sort(sample(seq_len(1814),size=200,replace=FALSE))</span></a>
<a class="sourceLine" id="cb1-16" data-line-number="16"> <span class="co">#psel &lt;- sort(sample(seq_len(10346),size=10346,replace=FALSE))</span></a>
<a class="sourceLine" id="cb1-17" data-line-number="17"> <span class="co">#y &lt;- mice.pheno$Obesity.BMI[nsel] # try different phenotypes</span></a>
<a class="sourceLine" id="cb1-18" data-line-number="18"> <span class="co">#X &lt;- mice.X[nsel,psel]</span></a>
<a class="sourceLine" id="cb1-19" data-line-number="19"> </a>
<a class="sourceLine" id="cb1-20" data-line-number="20"> ### wheat data <span class="al">###</span></a>
<a class="sourceLine" id="cb1-21" data-line-number="21"> <span class="kw">data</span>(wheat,<span class="dt">package=</span><span class="st">"BGLR"</span>)</a>
<a class="sourceLine" id="cb1-22" data-line-number="22"> nsel &lt;-<span class="st"> </span><span class="kw">seq_len</span>(<span class="dv">599</span>) <span class="co"># sort(sample(seq_len(599),size=200,replace=FALSE))</span></a>
<a class="sourceLine" id="cb1-23" data-line-number="23"> psel &lt;-<span class="st"> </span><span class="kw">seq_len</span>(<span class="dv">1279</span>) <span class="co"># sort(sample(seq_len(1279),size=200,replace=FALSE))</span></a>
<a class="sourceLine" id="cb1-24" data-line-number="24"> y &lt;-<span class="st"> </span><span class="kw">as.numeric</span>(wheat.Y[nsel,rep]) <span class="co"># try different phenotypes</span></a>
<a class="sourceLine" id="cb1-25" data-line-number="25"> X &lt;-<span class="st"> </span>wheat.X[nsel,psel]</a>
<a class="sourceLine" id="cb1-26" data-line-number="26"> </a>
<a class="sourceLine" id="cb1-27" data-line-number="27"> <span class="co">#----- CROSS-VALIDATE -----</span></a>
<a class="sourceLine" id="cb1-28" data-line-number="28"> </a>
<a class="sourceLine" id="cb1-29" data-line-number="29"> loss &lt;-<span class="st"> </span><span class="kw"><a href="../reference/colasso_compare.html">colasso_compare</a></span>(<span class="dt">y=</span>y,<span class="dt">X=</span>X)</a>
<a class="sourceLine" id="cb1-30" data-line-number="30"> </a>
<a class="sourceLine" id="cb1-31" data-line-number="31"> <span class="co"># fold &lt;- sample(x=rep(x=seq_len(5),length.out=length(y)))</span></a>
<a class="sourceLine" id="cb1-32" data-line-number="32"> <span class="co"># pred &lt;- matrix(data=NA,nrow=length(y),ncol=8)</span></a>
<a class="sourceLine" id="cb1-33" data-line-number="33"> <span class="co"># for(i in sort(unique(fold))){</span></a>
<a class="sourceLine" id="cb1-34" data-line-number="34"> <span class="co"># cat("i =",i,"\n")</span></a>
<a class="sourceLine" id="cb1-35" data-line-number="35"> <span class="co"># fit &lt;- colasso(y=y[fold!=i],X=X[fold!=i,],alpha=1) # increase nfold? us</span></a>
<a class="sourceLine" id="cb1-36" data-line-number="36"> <span class="co"># # MEM[[i]] &lt;- fit # trial</span></a>
<a class="sourceLine" id="cb1-37" data-line-number="37"> <span class="co"># </span></a>
<a class="sourceLine" id="cb1-38" data-line-number="38"> <span class="co"># for(j in seq_along(fit)){</span></a>
<a class="sourceLine" id="cb1-39" data-line-number="39"> <span class="co"># pred[fold==i,j] &lt;- glmnet::predict.glmnet(object=fit[[j]],</span></a>
<a class="sourceLine" id="cb1-40" data-line-number="40"> <span class="co"># newx=X[fold==i,],</span></a>
<a class="sourceLine" id="cb1-41" data-line-number="41"> <span class="co"># s=fit[[j]]$lambda.min,</span></a>
<a class="sourceLine" id="cb1-42" data-line-number="42"> <span class="co"># type="response")</span></a>
<a class="sourceLine" id="cb1-43" data-line-number="43"> <span class="co"># }</span></a>
<a class="sourceLine" id="cb1-44" data-line-number="44"> <span class="co"># pred[fold==i,8] &lt;- mean(y[fold!=i]) # intercept-only model</span></a>
<a class="sourceLine" id="cb1-45" data-line-number="45"> <span class="co"># }</span></a>
<a class="sourceLine" id="cb1-46" data-line-number="46"> <span class="co"># loss &lt;- apply(X=pred,MARGIN=2,FUN=function(x) sum((y-x)^2))</span></a>
<a class="sourceLine" id="cb1-47" data-line-number="47"> </a>
<a class="sourceLine" id="cb1-48" data-line-number="48"> <span class="co">#graphics::par(mar=c(3,3,1,1))</span></a>
<a class="sourceLine" id="cb1-49" data-line-number="49"> <span class="co">#col &lt;- rep(x=0,times=length(loss)-1)</span></a>
<a class="sourceLine" id="cb1-50" data-line-number="50"> <span class="co">#col[1] &lt;- col[length(col)] &lt;- 1</span></a>
<a class="sourceLine" id="cb1-51" data-line-number="51"> <span class="co">#plot(y=loss[-length(loss)],x=seq_len(length(loss)-1),</span></a>
<a class="sourceLine" id="cb1-52" data-line-number="52"> <span class="co"># col=col+1,pch=col) # ,ylim=range(loss))</span></a>
<a class="sourceLine" id="cb1-53" data-line-number="53"> <span class="co">#abline(v=c(1.5,length(loss)-1.5),lty=2)</span></a>
<a class="sourceLine" id="cb1-54" data-line-number="54"> <span class="co">#graphics::grid()</span></a>
<a class="sourceLine" id="cb1-55" data-line-number="55"> <span class="co">#abline(h=loss[length(loss)],lty=2,col="red")</span></a>
<a class="sourceLine" id="cb1-56" data-line-number="56"> </a>
<a class="sourceLine" id="cb1-57" data-line-number="57">}</a></code></pre></div>
<div id="bbmri-data-important" class="section level1">
<h1 class="hasAnchor">
<a href="#bbmri-data-important" class="anchor"></a>BBMRI DATA (important!)</h1>
<p>Repeat this for all normally distributed responses, omit samples with missing response, save results to file.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" data-line-number="1">utils<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/utils/topics/data">data</a></span>(metabolomics_RP3RP4_overlap,<span class="dt">package=</span><span class="st">"BBMRIomics"</span>)</a>
<a class="sourceLine" id="cb2-2" data-line-number="2">utils<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/utils/topics/data">data</a></span>(rnaSeqData_ReadCounts_BIOS_cleaned,<span class="dt">package=</span><span class="st">"BBMRIomics"</span>)</a>
<a class="sourceLine" id="cb2-3" data-line-number="3"></a>
<a class="sourceLine" id="cb2-4" data-line-number="4">samples &lt;-<span class="st"> </span><span class="kw">intersect</span>(<span class="kw">colnames</span>(counts),<span class="kw">colnames</span>(metabolomicData))</a>
<a class="sourceLine" id="cb2-5" data-line-number="5"></a>
<a class="sourceLine" id="cb2-6" data-line-number="6">Y &lt;-<span class="st"> </span><span class="kw">t</span>(SummarizedExperiment<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/SummarizedExperiment/topics/SummarizedExperiment-class">assays</a></span>(metabolomicData[,samples])<span class="op">$</span>measurements)</a>
<a class="sourceLine" id="cb2-7" data-line-number="7">X &lt;-<span class="st"> </span><span class="kw">t</span>(SummarizedExperiment<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/SummarizedExperiment/topics/SummarizedExperiment-class">assay</a></span>(counts[,samples]))</a>
<a class="sourceLine" id="cb2-8" data-line-number="8"></a>
<a class="sourceLine" id="cb2-9" data-line-number="9">loss &lt;-<span class="st"> </span><span class="ot">NULL</span></a>
<a class="sourceLine" id="cb2-10" data-line-number="10"><span class="cf">for</span>(j <span class="cf">in</span> <span class="kw">seq_len</span>(<span class="kw">ncol</span>(Y))){</a>
<a class="sourceLine" id="cb2-11" data-line-number="11"> y &lt;-<span class="st"> </span>Y[,j]</a>
<a class="sourceLine" id="cb2-12" data-line-number="12"> <span class="cf">if</span>(<span class="kw">sd</span>(y,<span class="dt">na.rm=</span><span class="ot">TRUE</span>)<span class="op">==</span><span class="dv">0</span>){<span class="cf">next</span>}</a>
<a class="sourceLine" id="cb2-13" data-line-number="13"> cond &lt;-<span class="st"> </span><span class="op">!</span><span class="kw">is.na</span>(y)</a>
<a class="sourceLine" id="cb2-14" data-line-number="14"> y &lt;-<span class="st"> </span>y[cond]</a>
<a class="sourceLine" id="cb2-15" data-line-number="15"> x &lt;-<span class="st"> </span>X[cond,]</a>
<a class="sourceLine" id="cb2-16" data-line-number="16"> loss &lt;-<span class="st"> </span><span class="kw">rbind</span>(loss,<span class="kw"><a href="../reference/colasso.html">colasso</a></span>(<span class="dt">y=</span>y,<span class="dt">X=</span>x))</a>
<a class="sourceLine" id="cb2-17" data-line-number="17">}</a>
<a class="sourceLine" id="cb2-18" data-line-number="18"></a>
<a class="sourceLine" id="cb2-19" data-line-number="19">min &lt;-<span class="st"> </span><span class="kw">apply</span>(traits,<span class="dv">2</span>,<span class="cf">function</span>(x) <span class="kw">min</span>(x,<span class="dt">na.rm=</span><span class="ot">TRUE</span>))</a>
<a class="sourceLine" id="cb2-20" data-line-number="20">max &lt;-<span class="st"> </span><span class="kw">apply</span>(traits,<span class="dv">2</span>,<span class="cf">function</span>(x) <span class="kw">max</span>(x,<span class="dt">na.rm=</span><span class="ot">TRUE</span>))</a>
<a class="sourceLine" id="cb2-21" data-line-number="21">var &lt;-<span class="st"> </span><span class="kw">apply</span>(traits,<span class="dv">2</span>,<span class="cf">function</span>(x) <span class="kw">var</span>(x,<span class="dt">na.rm=</span><span class="ot">TRUE</span>))</a>
<a class="sourceLine" id="cb2-22" data-line-number="22"></a>
<a class="sourceLine" id="cb2-23" data-line-number="23"></a>
<a class="sourceLine" id="cb2-24" data-line-number="24"></a>
<a class="sourceLine" id="cb2-25" data-line-number="25">psel &lt;-<span class="st"> </span><span class="kw">sample</span>(<span class="kw">seq_len</span>(<span class="dv">56515</span>),<span class="dt">size=</span><span class="dv">2000</span>)</a>
<a class="sourceLine" id="cb2-26" data-line-number="26">nsel &lt;-<span class="st"> </span><span class="kw">sample</span>(<span class="kw">seq_len</span>(<span class="dv">2003</span>),<span class="dt">size=</span><span class="dv">500</span>)</a>
<a class="sourceLine" id="cb2-27" data-line-number="27"></a>
<a class="sourceLine" id="cb2-28" data-line-number="28">y &lt;-<span class="st"> </span>YY[nsel,<span class="st">"totfa"</span>]</a>
<a class="sourceLine" id="cb2-29" data-line-number="29">X &lt;-<span class="st"> </span>XX[nsel,psel]</a>
<a class="sourceLine" id="cb2-30" data-line-number="30"></a>
<a class="sourceLine" id="cb2-31" data-line-number="31">net &lt;-<span class="st"> </span>glmnet<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/glmnet/topics/cv.glmnet">cv.glmnet</a></span>(<span class="dt">x=</span>X,<span class="dt">y=</span>y)</a>
<a class="sourceLine" id="cb2-32" data-line-number="32"></a>
<a class="sourceLine" id="cb2-33" data-line-number="33"><span class="kw">plot</span>(<span class="dt">x=</span>net<span class="op">$</span>lambda,<span class="dt">y=</span>net<span class="op">$</span>cvm)</a>
<a class="sourceLine" id="cb2-34" data-line-number="34"></a>
<a class="sourceLine" id="cb2-35" data-line-number="35"><span class="co"># then apply the colasso function</span></a></code></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" data-line-number="1">n &lt;-<span class="st"> </span><span class="dv">100</span>; p &lt;-<span class="st"> </span><span class="dv">10</span></a>
<a class="sourceLine" id="cb3-2" data-line-number="2">x &lt;-<span class="st"> </span><span class="kw">matrix</span>(<span class="kw">rnorm</span>(n<span class="op">*</span>p),<span class="dt">nrow=</span>n,<span class="dt">ncol=</span>p)</a>
<a class="sourceLine" id="cb3-3" data-line-number="3">y &lt;-<span class="st"> </span><span class="kw">rbinom</span>(<span class="dt">n=</span>n,<span class="dt">size=</span><span class="dv">1</span>,<span class="dt">prob=</span><span class="fl">0.2</span>)</a>
<a class="sourceLine" id="cb3-4" data-line-number="4">a &lt;-<span class="st"> </span>stats<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/stats/topics/glm">glm</a></span>(y<span class="op">~</span>x,<span class="dt">family=</span><span class="st">"binomial"</span>)</a>
<a class="sourceLine" id="cb3-5" data-line-number="5">y &lt;-<span class="st"> </span><span class="kw">log</span>(y<span class="op">/</span>(<span class="dv">1</span><span class="op">-</span>y))</a>
<a class="sourceLine" id="cb3-6" data-line-number="6">y[y<span class="op">==-</span><span class="ot">Inf</span>] &lt;-<span class="st"> </span><span class="fl">-99e99</span></a>
<a class="sourceLine" id="cb3-7" data-line-number="7">y[y<span class="op">==</span><span class="ot">Inf</span>] &lt;-<span class="st"> </span><span class="fl">99e99</span></a>
<a class="sourceLine" id="cb3-8" data-line-number="8">b &lt;-<span class="st"> </span>stats<span class="op">::</span><span class="kw"><a href="http://www.rdocumentation.org/packages/stats/topics/glm">glm</a></span>(y<span class="op">~</span>x,<span class="dt">family=</span><span class="st">"gaussian"</span>)</a></code></pre></div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#bbmri-data-important">BBMRI DATA (important!)</a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p>Developed by Armin Rauschenberger.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="http://pkgdown.r-lib.org/">pkgdown</a>.</p>
</div>
</footer>
</div>
</body>
</html>
......@@ -6,7 +6,7 @@
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Authors • spliceQTL</title>
<title>Authors • colasso</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
......@@ -54,7 +54,7 @@
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class=