Installation


Install the current release from CRAN:

install.packages("joinet")

Or install the latest development version from GitHub:

#install.packages("devtools")
devtools::install_github("rauschenberger/joinet")

Initialisation

Load and attach the package:

library(joinet)

And access the documentation:

?joinet
help(joinet)
browseVignettes("joinet")

Simulation

For n samples, we simulate p inputs (features, covariates) and q outputs (outcomes, responses). We assume high-dimensional inputs (p $$\gg$$ n) and low-dimensional outputs (q $$\ll$$ n).

n <- 100
q <- 2
p <- 500

We simulate the p inputs from a multivariate normal distribution. For the mean, we use the p-dimensional vector mu, where all elements equal zero. For the covariance, we use the p $$\times$$ p matrix Sigma, where the entry in row $$i$$ and column $$j$$ equals rho$$^{|i-j|}$$. The parameter rho (0 < rho < 1) determines the strength of the correlation among the inputs, with small rho leading to weak correlations and large rho leading to strong correlations. The input matrix X has n rows and p columns.

mu <- rep(0,times=p)
rho <- 0.90
Sigma <- rho^abs(col(diag(p))-row(diag(p)))
X <- MASS::mvrnorm(n=n,mu=mu,Sigma=Sigma)
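
To make the structure of Sigma concrete, the following minimal sketch builds the same kind of matrix for a small case. The names p_small, rho_small, Sigma_small and Sigma_check are hypothetical and introduced only for this illustration, so that p, rho and Sigma from the simulation are not overwritten. The sketch uses outer as an alternative way to obtain $$|i-j|$$ and then checks that the col/row construction gives the same result.

```r
# Illustrative sketch: all names ending in _small or _check are hypothetical.
p_small <- 4
rho_small <- 0.90
index <- seq_len(p_small)

# Entry (i, j) equals rho^|i-j|: ones on the diagonal,
# and correlations that decay with the distance between columns.
Sigma_small <- rho_small^abs(outer(index, index, FUN = "-"))
print(round(Sigma_small, 3))

# The col/row construction yields the same matrix
# (col - row and row - col differ only in sign, which abs removes).
Sigma_check <- rho_small^abs(col(diag(p_small)) - row(diag(p_small)))
stopifnot(isTRUE(all.equal(Sigma_small, Sigma_check)))
```

Neighbouring inputs have correlation rho, inputs two columns apart have correlation rho squared, and so on, so distant inputs are nearly uncorrelated even when rho is large.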

We simulate the input-output effects from independent Bernoulli distributions. The parameter pi determines the number of effects, with small pi leading to few effects, and large pi leading to many effects (0 < pi < 1). The scalar alpha represents the intercept, and the p-dimensional vector beta represents the slopes.

pi <- 0.01
alpha <- 0
beta <- rbinom(n=p,size=1,prob=pi)
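
Since each of the p slopes is non-zero with probability pi, the expected number of effects is p times pi, which is 5 for the values used here. The sketch below illustrates this with repeated draws. All names in it (p_example, prob_effect, beta_example, counts, expected_count) are hypothetical, and the seed is arbitrary. Note that assigning a value to pi masks R's built-in constant of the same name, which is why the sketch uses prob_effect instead.

```r
# Illustrative sketch: all names below are hypothetical and local to this example.
set.seed(1)  # arbitrary seed, only to make this sketch reproducible
p_example <- 500
prob_effect <- 0.01  # plays the role of pi without masking the built-in constant

# One draw of the slopes: 1 marks an effect, 0 marks no effect.
beta_example <- rbinom(n = p_example, size = 1, prob = prob_effect)
sum(beta_example != 0)  # number of non-zero effects in this draw

# Across many draws, the average count approaches p_example * prob_effect.
counts <- replicate(n = 1000,
                    expr = sum(rbinom(n = p_example, size = 1, prob = prob_effect)))
expected_count <- p_example * prob_effect
mean(counts)
```

A draw without any effect is possible, although unlikely with these settings (probability 0.99^500, which is below 1%). In that case the linear predictor is constant, so it may be worth checking sum(beta) before continuing.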

From the intercept alpha, the slopes beta and the inputs X, we calculate the linear predictor, the n-dimensional vector eta. Rescale the linear predictor to make the effects weaker or stronger. Set family to "gaussian", "binomial", or "poisson" to define the distribution. The n $$\times$$ q matrix Y represents the outputs. Because all q outputs are simulated from the same linear predictor, they are positively correlated.

eta <- alpha + X %*% beta
eta <- 1.5*scale(eta)
family <- "gaussian"

if(family=="gaussian"){
  mean <- eta
  Y <- replicate(n=q,expr=rnorm(n=n,mean=mean))
}

if(family=="binomial"){
  prob <- 1/(1+exp(-eta))
  Y <- replicate(n=q,expr=rbinom(n=n,size=1,prob=prob))
}

if(family=="poisson"){
  lambda <- exp(eta)
  Y <- replicate(n=q,expr=rpois(n=n,lambda=lambda))
}

cor(Y)
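
For the Gaussian family, a short calculation indicates how strongly the outputs should be correlated. As an assumption made only for this sketch, treat the rescaled linear predictor $$\eta$$ as a random variable with variance $$1.5^2 = 2.25$$, and write $$\varepsilon_k$$ for the independent standard normal noise that rnorm adds to output $$k$$ (a symbol introduced here, not used elsewhere in this document):

$$
Y_k = \eta + \varepsilon_k, \qquad \mathrm{Cov}(Y_1, Y_2) = \mathrm{Var}(\eta) = 2.25, \qquad \mathrm{Var}(Y_k) = \mathrm{Var}(\eta) + 1 = 3.25
$$

$$
\mathrm{Cor}(Y_1, Y_2) = \frac{2.25}{3.25} \approx 0.69
$$

Under these assumptions, the off-diagonal entries of cor(Y) should be in the vicinity of 0.69, up to sampling variation. A larger rescaling factor than 1.5 moves the correlation towards 1, and a smaller one moves it towards 0.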

Application

The function joinet fits univariate and multivariate regression. Set the argument alpha.base to 0 (ridge) or 1 (lasso).

object <- joinet(Y=Y,X=X,family=family)

Standard methods are available. The function predict returns the predicted values from the univariate (base) and multivariate (meta) models. The function coef returns the estimated intercepts (alpha) and slopes (beta) from the multivariate model (input-output effects). And the function weights returns the weights from stacking (output-output effects).

predict(object,newx=X)

coef(object)

weights(object)

The function cv.joinet compares the predictive performance of univariate (base) and multivariate (meta) regression by nested cross-validation. The argument type.measure determines the loss function.

cv.joinet(Y=Y,X=X,family=family)
##          [,1]     [,2]
## base 1.204741 1.523550
## meta 1.161487 1.283678
## none 3.206394 3.495571
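
One way to read this output is to express the difference between base and meta as a relative reduction in loss for each output. The sketch below does this with the values printed above, which come from one simulated data set and will differ between runs. The names loss and improvement are hypothetical, and the sketch assumes that lower values are better, as is usual for a loss.

```r
# Losses copied from the printed output above (rows: models, columns: outputs).
loss <- rbind(
  base = c(1.204741, 1.523550),
  meta = c(1.161487, 1.283678),
  none = c(3.206394, 3.495571)
)

# Relative reduction in loss when moving from base to meta, per output.
# Positive values mean that the meta model has the lower loss.
improvement <- 1 - loss["meta", ] / loss["base", ]
round(100 * improvement, 1)
```

For this particular run, the multivariate (meta) model reduces the loss by roughly 4% for the first output and roughly 16% for the second output, relative to the univariate (base) model.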

Reference


Armin Rauschenberger and Enrico Glaab (2020). “joinet: predicting correlated outcomes jointly to improve clinical prognosis”. Manuscript in preparation.

