Variable importance in matched case–control studies in settings of high dimensional data
Summary: We propose a method for assessing variable importance in matched case–control investigations and other highly stratified studies characterized by high dimensional data (p >> n). In simulated and real data sets, we show that the proposed algorithm performs better than a conventional univariate method (conditional logistic regression) and a popular multivariable algorithm (random forests) that does not take the matching into account. The methods are applicable to wide ranging, high impact clinical studies, including metabolomic and proteomic studies and neuroimaging analyses, such as those assessing stroke and Alzheimer's disease. The proposed methods have been implemented in a freely available R library (http://cran.r-project.org/web/packages/RPCLR/index.html).
Year of publication: | 2014 |
---|---|
Authors: | Balasubramanian, Raji ; Houseman, E. Andres ; Coull, Brent A. ; Lev, Michael H. ; Schwamm, Lee H. ; Betensky, Rebecca A. |
Published in: | Journal of the Royal Statistical Society Series C. - Royal Statistical Society - RSS, ISSN 0035-9254. - Vol. 63.2014, 4, p. 639-655 |
Publisher: | Royal Statistical Society - RSS |
Online Resource
Similar items by person
- Qian, Jing, (2014)
- A computationally tractable multivariate random effects model for clustered binary data
  Coull, Brent A., (2006)
- Houseman, E. Andres, (2006)