Install the current release from [CRAN](https://CRAN.R-project.org/package=joinet):
```{r,eval=FALSE}
install.packages("joinet")
```
Or install the latest development version from [GitHub](https://github.com/rauschenberger/joinet):
```{r,eval=FALSE}
#install.packages("devtools")
devtools::install_github("rauschenberger/joinet")
```
## Initialisation
Load and attach the package:
```{r}
library(joinet)
```
And access the [documentation](https://rauschenberger.github.io/joinet/):
```{r,eval=FALSE}
?joinet
help(joinet)
browseVignettes("joinet")
```
## Simulation
For `n` samples, we simulate `p` inputs (features, covariates) and `q` outputs (outcomes, responses). We assume high-dimensional inputs (`p` $\gg$ `n`) and low-dimensional outputs (`q` $\ll$ `n`).
```{r}
n <- 100
q <- 2
p <- 500
```
We simulate the `p` inputs from a multivariate normal distribution. For the mean, we use the `p`-dimensional vector `mu`, where all elements equal zero. For the covariance, we use the `p` $\times$ `p` matrix `Sigma`, where the entry in row $i$ and column $j$ equals `rho`$^{|i-j|}$. The parameter `rho` determines the strength of the correlation among the inputs, with small `rho` leading to weak correlations, and large `rho` leading to strong correlations (0 < `rho` < 1). The input matrix `X` has `n` rows and `p` columns.
```{r}
mu <- rep(0,times=p)
rho <- 0.90
Sigma <- rho^abs(col(diag(p))-row(diag(p)))
X <- MASS::mvrnorm(n=n,mu=mu,Sigma=Sigma)
```
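For a small `p`, the autoregressive structure of `Sigma` is easy to inspect: each entry decays geometrically with the distance between its row and column indices. This check is illustrative only (the value `p_small <- 4` is an assumption, not part of the simulation):

```r
# illustrative check: Sigma[i,j] equals rho^|i-j|
p_small <- 4
rho <- 0.90
Sigma_small <- rho^abs(col(diag(p_small))-row(diag(p_small)))
round(Sigma_small,3)
#       [,1]  [,2]  [,3]  [,4]
# [1,] 1.000 0.900 0.810 0.729
# [2,] 0.900 1.000 0.900 0.810
# [3,] 0.810 0.900 1.000 0.900
# [4,] 0.729 0.810 0.900 1.000
```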
We simulate the input-output effects from independent Bernoulli distributions. The parameter `pi` determines the number of effects, with small `pi` leading to few effects, and large `pi` leading to many effects (0 < `pi` < 1). The scalar `alpha` represents the intercept, and the `p`-dimensional vector `beta` represents the slopes.
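A minimal sketch of this step, where the specific value `pi <- 0.01` and the zero intercept are assumptions chosen for illustration:

```r
p <- 500                                     # number of inputs, as above
pi <- 0.01                                   # assumed value: small pi gives few effects
set.seed(1)                                  # for reproducibility
alpha <- 0                                   # intercept (assumed value)
beta <- rbinom(n=p,size=1,prob=pi)           # Bernoulli draw: 1 means the input has an effect
sum(beta)                                    # number of active effects, roughly p*pi
```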