R Lab - Part II
Association modeling
In this part of the exercise, we model (on the log-scale) the association of miRNA espression on protein expression adjusting for the corresponding mRNA.
Investigate miR-107 and B-RAF (Aure et al, 2015, Figure 2H)
= prt[12,]
prt.BRAF = rna[12,]
rna.BRAF .107 = mir[16,] mir
(a) Linear regression model
on the log-scale, Aure et al. 2015, equation (3)
<- lm(prt.BRAF ~ mir.107 + rna.BRAF)
fitA summary(fitA)
Call:
lm(formula = prt.BRAF ~ mir.107 + rna.BRAF)
Residuals:
Min 1Q Median 3Q Max
-2.3028 -0.6126 0.0453 0.6153 3.1361
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7.315e-08 5.068e-02 0.000 1
mir.107 4.324e-01 5.079e-02 8.513 1.06e-15 ***
rna.BRAF 3.200e-01 5.079e-02 6.301 1.15e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.851 on 279 degrees of freedom
Multiple R-squared: 0.281, Adjusted R-squared: 0.2758
F-statistic: 54.51 on 2 and 279 DF, p-value: < 2.2e-16
Add smooth non-linear cures to the scatterplots: use existing panel.smooth()
function, and add linear regression lines to the scatterplots:
<- function (x, y, col.regres = "blue", ...)
panel.linear
{ points(x, y, pch=19)
<- is.finite(x) & is.finite(y)
ok if (any(ok))
abline(stats::lm(y[ok] ~ x[ok]), col = col.regres, ...)
}
pairs(data.frame(mir.107, prt.BRAF, rna.BRAF),
lower.panel = panel.smooth,
upper.panel = panel.linear)
(b) Lasso-penalised linear model
with all miRNAs (Aure et al. 2015, equation (4))
library(glmnet)
# 10-fold CV to determine the optimal lambda:
# Note: rna.BRAF is penalised together with all the mir variables.
# You can use the penalty.factor option to avoid this.
set.seed(1234)
<- cv.glmnet(y=prt.BRAF, x=t(rbind(mir, rna.BRAF)),
cvfit alpha=1, nfolds=10, standardize=TRUE)
par(mfrow=c(1,1))
plot(cvfit)
<- cvfit$lambda.min
lambda.opt
# Coefficient path plot and coefficients for optimal lambda:
<- cvfit$glmnet.fit
fitB
plot(fitB, xvar="lambda")
abline(v=log(lambda.opt))
coef(fitB, s=lambda.opt)
predict(fitB, type="nonzero", s=lambda.opt)
Compare the regression coefficient of mir.107 from the models in (a) and (b):
coef(fitA)["mir.107"]
as.matrix(coef(fitB, s=cvfit$lambda.min))["hsa-miR-107",]