An Algorithm for Creating Models for Imputation Using the MICE Approach: An Application in Stata

It is generally advised that imputation models contain as many “predictor” variables as possible, since more variables provide more information from which to estimate the missing values (van Buuren, Boshuizen & Knook 1999). Ideally, an imputation model might contain all variables in the dataset. Hence, software packages that perform multivariate imputation by chained equations (e.g. ice in Stata) often default to using all other variables in the imputation model to predict missing values. However, in datasets with moderate to large numbers of variables, attempting to use all other variables results in imputation models that are too large to actually run. One solution to this problem is to select, for each variable to be imputed, a relatively large but reasonable number of predictors based on bivariate correlations, and then to drop predictors as necessary until the resulting regression model is tractable using the complete data. This set of regression models forms the imputation model for the entire dataset. This presentation outlines this approach in more detail and presents an overview of the Stata package that implements it.
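
The correlation-based selection step described above can be illustrated with a minimal sketch. It is written in Python rather than Stata and is not the package's implementation: the function names `pairwise_corr` and `select_predictors`, the parameters `max_predictors` and `min_abs_corr`, and the toy data are all assumptions introduced for illustration. The sketch covers only the selection by bivariate correlation; the later step of dropping predictors until each regression is tractable is not shown.

```python
import math


def pairwise_corr(x, y):
    """Pearson correlation over the rows where both values are observed.

    Missing values are represented as None (an assumption of this sketch).
    Returns 0.0 when fewer than three complete pairs exist or when either
    variable is constant over the complete pairs.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_predictors(data, max_predictors=2, min_abs_corr=0.1):
    """For each variable with missing values, keep the most correlated others.

    data maps variable names to equal-length lists of numbers or None.
    Both thresholds are hypothetical tuning knobs, not values from the source.
    """
    models = {}
    for target, values in data.items():
        if all(v is not None for v in values):
            continue  # fully observed: no imputation model needed
        scored = []
        for other, other_values in data.items():
            if other == target:
                continue
            r = abs(pairwise_corr(values, other_values))
            if r >= min_abs_corr:
                scored.append((r, other))
        # Strongest absolute correlation first; name breaks exact ties.
        scored.sort(key=lambda item: (-item[0], item[1]))
        models[target] = [name for _, name in scored[:max_predictors]]
    return models


# Toy dataset (fabricated purely to exercise the functions).
data = {
    "income": [10.0, 20.0, None, 40.0, 50.0],
    "age": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [3.0, 1.0, 2.0, 1.0, 3.0],
    "educ": [2.0, 4.0, 6.0, None, 10.0],
}
models = select_predictors(data)
```

Here `models` maps each incomplete variable to its candidate predictor list; fully observed variables such as `age` receive no model of their own but can still serve as predictors. Capping the list per variable is what keeps each regression small enough to fit.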


Year of publication:	2007-10-31
Authors:	Medeiros, Rose
Institutions:	Stata User Group

freely available


Extent:	application/pdf
Series:	West Coast Stata Users' Group Meetings 2007.
Type of publication:	Book / Working Paper
Language:	English
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10005028076