Multivariate Elastic Net Regression • joinet

Armin Rauschenberger committed Oct 02, 2020 and Jun 16, 2021

Installation

Install the current release from CRAN:

install.packages("joinet")

Or install the latest development version from GitHub:

#install.packages("devtools")
devtools::install_github("rauschenberger/joinet")

Initialisation

Load and attach the package:

library(joinet)

And access the documentation:

?joinet
help(joinet)
browseVignettes("joinet")

Simulation

For n samples, we simulate p inputs (features, covariates) and q outputs (outcomes, responses). We assume high-dimensional inputs (p $$\gg$$ n) and low-dimensional outputs (q $$\ll$$ n).

n <- 100
q <- 2
p <- 500

We simulate the p inputs from a multivariate normal distribution. For the mean, we use the p-dimensional vector mu, where all elements equal zero. For the covariance, we use the p $$\times$$ p matrix Sigma, where the entry in row $$i$$ and column $$j$$ equals rho$$^{|i-j|}$$. The parameter rho determines the strength of the correlation among the inputs, with small rho leading to weak correlations and large rho leading to strong correlations (0 < rho < 1). The input matrix X has n rows and p columns.

mu <- rep(0,times=p)
rho <- 0.90
Sigma <- rho^abs(col(diag(p))-row(diag(p)))
X <- MASS::mvrnorm(n=n,mu=mu,Sigma=Sigma)
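To make the covariance structure concrete, the following sketch builds Sigma for a small number of inputs (p = 4 is an assumption chosen for readability, not the value used above) and checks it against an equivalent construction with stats::toeplitz. The name Sigma_toeplitz is introduced here for illustration only.

```r
# Small illustration of the covariance structure (p = 4 instead of 500)
p <- 4
rho <- 0.90

# Construction used in the text: entry (i,j) equals rho^|i-j|
Sigma <- rho^abs(col(diag(p))-row(diag(p)))

# Equivalent construction from the first row (illustrative alternative)
Sigma_toeplitz <- toeplitz(rho^(0:(p-1)))

# Both constructions should agree
all.equal(Sigma,Sigma_toeplitz)

# Correlations decay with the distance between input indices
round(Sigma,digits=3)
```

Neighbouring inputs have correlation rho, inputs two positions apart have correlation rho squared, and so on.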

We simulate the input-output effects from independent Bernoulli distributions. The parameter pi determines the number of effects, with small pi leading to few effects, and large pi leading to many effects (0 < pi < 1). The scalar alpha represents the intercept, and the p-dimensional vector beta represents the slopes.

pi <- 0.01
alpha <- 0
beta <- rbinom(n=p,size=1,prob=pi)
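As a rough check on the sparsity, the sketch below counts the non-zero effects in one draw and compares the count with its expectation p*pi. The seed and the redraw loop are assumptions added for illustration; they are not part of the original simulation. The loop guards against the rare draw without any effect, in which case the linear predictor would be constant and rescaling it would produce missing values.

```r
set.seed(1)  # assumption: seed used only to make this sketch reproducible
p <- 500
pi <- 0.01   # as in the text; this masks R's built-in constant pi

beta <- rbinom(n=p,size=1,prob=pi)

sum(beta)    # number of non-zero effects in this draw
p*pi         # expected number of non-zero effects
(1-pi)^p     # probability of drawing no effect at all

# Hypothetical guard: redraw until at least one effect is present
while(sum(beta)==0){
  beta <- rbinom(n=p,size=1,prob=pi)
}
```

With these settings, a draw without any effect has a probability below 1%, so the guard rarely triggers.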

From the intercept alpha, the slopes beta and the inputs X, we calculate the linear predictor, the n-dimensional vector eta. Rescale the linear predictor to make the effects weaker or stronger. Set family to "gaussian", "binomial", or "poisson" to define the distribution. The n $$\times$$ q matrix Y represents the outputs. Because all q outputs are simulated from the same linear predictor, we expect them to be positively correlated.

eta <- alpha + X %*% beta
eta <- 1.5*scale(eta)
family <- "gaussian"

if(family=="gaussian"){
  mean <- eta
  Y <- replicate(n=q,expr=rnorm(n=n,mean=mean))
}

if(family=="binomial"){
  prob <- 1/(1+exp(-eta))
  Y <- replicate(n=q,expr=rbinom(n=n,size=1,prob=prob))
}

if(family=="poisson"){
  lambda <- exp(eta)
  Y <- replicate(n=q,expr=rpois(n=n,lambda=lambda))
}

cor(Y)
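The three branches above can also be wrapped in a single function. The following is a minimal sketch under the assumption that all outputs share one linear predictor; the helper simulate_outputs is hypothetical and not part of joinet.

```r
# Hypothetical helper (not part of joinet): draw q outputs from one
# linear predictor eta, using the inverse link of the chosen family.
simulate_outputs <- function(eta,q,family=c("gaussian","binomial","poisson")){
  family <- match.arg(family)
  n <- length(eta)
  draw <- switch(family,
    gaussian=function() rnorm(n=n,mean=eta),
    binomial=function() rbinom(n=n,size=1,prob=1/(1+exp(-eta))),
    poisson=function() rpois(n=n,lambda=exp(eta)))
  # Each replicate is one output (column) of length n
  replicate(n=q,expr=draw())
}

set.seed(1)  # assumption: seed used only for reproducibility
eta <- 1.5*rnorm(100)  # stand-in for the rescaled linear predictor
Y_binomial <- simulate_outputs(eta=eta,q=2,family="binomial")
dim(Y_binomial)  # n rows and q columns
cor(Y_binomial)
```

Using match.arg rejects a misspelt family immediately instead of leaving Y undefined, which is what would happen if none of the three if statements matched.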

Application

The function joinet fits univariate and multivariate regression. Set the argument alpha.base to 0 (ridge) or 1 (lasso).

object <- joinet(Y=Y,X=X,family=family)
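To illustrate the argument alpha.base, the following sketch fits the model once with ridge and once with lasso base learners. It assumes that the package joinet is installed, and it uses a smaller simulated data set than above (independent inputs, three non-zero effects) purely to keep the run time short. The object names ridge and lasso are introduced here for illustration.

```r
library(joinet)  # assumes the package is installed

set.seed(1)  # assumption: seed used only for reproducibility
n <- 100; p <- 50; q <- 2

# Simplified simulation: independent inputs, first three inputs have effects
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
eta <- rowSums(X[,1:3])
Y <- replicate(n=q,expr=rnorm(n=n,mean=eta))

# Ridge (alpha.base=0) versus lasso (alpha.base=1) in the base learners
ridge <- joinet(Y=Y,X=X,family="gaussian",alpha.base=0)
lasso <- joinet(Y=Y,X=X,family="gaussian",alpha.base=1)

# Compare the stacking weights of the two fits
weights(ridge)
weights(lasso)
```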

Standard methods are available. The function predict returns the predicted values from the univariate (base) and multivariate (meta) models. The function coef returns the estimated intercepts (alpha) and slopes (beta) from the multivariate model (input-output effects). And the function weights returns the weights from stacking (output-output effects).

predict(object,newx=X)

coef(object)

weights(object)

The function cv.joinet compares the predictive performance of univariate (base) and multivariate (meta) regression by nested cross-validation. The argument type.measure determines the loss function.

cv.joinet(Y=Y,X=X,family=family)
##           [,1]     [,2]
## base  1.204741 1.523550
## meta  1.185200 1.278125
## FALSE       NA       NA
## none  3.206394 3.495571
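One way to read this output is to express the loss of the multivariate (meta) model relative to the univariate (base) model for each output. The sketch below copies the cross-validated losses printed above into a matrix. The name loss is introduced here for illustration, and the interpretation of the row none as a reference without inputs is an assumption that is not stated above.

```r
# Cross-validated losses copied from the output above
# (rows: models, columns: outputs)
loss <- rbind(base=c(1.204741,1.523550),
              meta=c(1.185200,1.278125),
              none=c(3.206394,3.495571))

# Does stacking decrease the loss for each output?
loss["meta",] < loss["base",]

# Percentage reduction in loss from base to meta
reduction <- 100*(1-loss["meta",]/loss["base",])
round(reduction,digits=1)
```

With these numbers, the reduction is roughly 1.6% for the first output and 16.1% for the second. Both models have a much lower loss than the row none.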

Reference


Armin Rauschenberger and Enrico Glaab (2021). “Predicting correlated outcomes from molecular data”. Manuscript in preparation.
