Outlier detection in high-dimensional datasets poses new challenges that have not been investigated in the literature. In this paper, we present an integrated methodology for the identification of outliers which is suitable for datasets with a higher number of variables than observations. Our...
Persistent link: https://www.econbiz.de/10011916875
In the causal adjustment setting, variable selection techniques based only on the outcome or only on the treatment allocation model can result in the omission of confounders and hence may lead to bias, or the inclusion of spurious variables and hence cause variance inflation, in...
Persistent link: https://www.econbiz.de/10014610859
Often the research interest in causal inference is on the regression causal effect, which is the mean difference in the potential outcomes conditional on the covariates. In this paper, we use sufficient dimension reduction to estimate a lower-dimensional linear combination of the...
Persistent link: https://www.econbiz.de/10014610874
Variable selection is a difficult problem in statistical model building. Identification of cost-efficient diagnostic factors is very important to health researchers, but most variable selection methods do not take into account the cost of collecting data for the predictors. The trade-off between...
Persistent link: https://www.econbiz.de/10009447237
In additive models, the problem of variable selection is strongly linked to the choice of the amount of smoothing used for components that represent metrical variables. Many software packages use separate tools to solve the different tasks of variable selection and smoothing parameter choice. The...
Persistent link: https://www.econbiz.de/10010266175
A new regularization method for regression models is proposed. The criterion to be minimized contains a penalty term which explicitly links strength of penalization to the correlation between predictors. Like the elastic net, the method encourages a grouping effect where strongly correlated...
Persistent link: https://www.econbiz.de/10010266210
The use of generalized additive models in statistical data analysis suffers from the restriction to few explanatory variables and the problems of selection of smoothing parameters. Generalized additive model boosting circumvents these problems by means of stagewise fitting of weak learners. A...
Persistent link: https://www.econbiz.de/10010266217
We address the problem of maximally selected chi-square statistics in the case of a binary Y variable and a nominal X variable with several categories. The distribution of the maximally selected chi-square statistic has already been derived when the best cutpoint is chosen from a continuous or...
Persistent link: https://www.econbiz.de/10010266224
Gene expression datasets usually have thousands of explanatory variables which are observed on only a few samples. Generally, most variables of a dataset have no effect and one is interested in eliminating these irrelevant variables. In order to obtain a subset of relevant variables an appropriate...
Persistent link: https://www.econbiz.de/10010266252
Specifying a prior distribution for the large number of parameters in the linear statistical model is a difficult step in the Bayesian approach to the design and analysis of experiments. Here we address this difficulty by proposing the use of functional priors and then by working out important...
Persistent link: https://www.econbiz.de/10009475773