Application of Multiple Imputation on the Statistical Analysis of Incomplete Data
Date Issued
2010
Date
2010
Author(s)
Liu, Pi-Lin
Abstract
This paper investigates the application of multiple imputation to the statistical analysis of incomplete data. Many statistical methods are designed for, and applicable only to, complete data, so incomplete data must first be completed before such methods can be used.
Rubin (1987) proposed multiple imputation, in which each missing value is replaced by m > 1 plausible values. The resulting m completed data sets are each analyzed with ordinary complete-data methods, and the m sets of results are then combined into a single inference. In this paper, data sets with missing proportions of 5%, 10%, 15% and 20% were analyzed in this way, and the results were compared with those from the original complete data.
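The combining step is usually carried out with Rubin's rules. The abstract does not give the exact formulas used in the thesis, so the following is only a minimal sketch of the standard pooling rule for a scalar parameter. The function name `pool_rubin` and the example numbers are hypothetical and serve only as illustration.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    estimates: the m estimates of one scalar parameter, one per imputed data set
    variances: the m squared standard errors of those estimates
    Returns (pooled estimate, total variance).
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need m > 1 estimates with matching variances")
    q_bar = q.mean()                 # pooled point estimate
    u_bar = u.mean()                 # within-imputation variance
    b = q.var(ddof=1)                # between-imputation variance
    t = u_bar + (1.0 + 1.0 / m) * b  # total variance of the pooled estimate
    return q_bar, t


# Hypothetical results from m = 3 imputed data sets (not taken from the thesis).
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled standard error reflects the uncertainty caused by the missing values rather than treating the imputed values as observed.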
The simulations in this paper are divided into three parts. The first concerns the estimation of model parameters in regression analysis and logistic regression. The second concerns multivariate statistical analysis of multivariate normally distributed data. The third concerns the covariance structures of multivariate data.
Results from the first part show that the discrepancies between parameter estimates from complete and incomplete data are proportional to the missing proportion for regression analysis, but that this relationship is less obvious for logistic regression. Results from the second part indicate that factor analysis is the most sensitive to the missing proportion. Results from the third part reveal that most of the covariance structures studied in this paper are also robust to the missing proportion.
Subjects
Multiple Imputation
Incomplete Data
Missing Data
Missing at Random
Markov Chain Monte Carlo
File(s)
Name
ntu-98-R97621203-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):afad919b05dadeaf8a0c3aa929f1b4d9
