- Posted
- 2025.09.15
- Modified
- 2025.09.15
- Author
- 최용석
(2025) A Study on Methods for Imputation of Missing Values in Survey Response, Journal of The Korean Data Analysis Society, 27(5).
Missing values in survey responses reduce the efficiency of survey research in terms of time and cost, and several statistical imputation methods can be applied to address them. This study applies four techniques: MICE, MissForest, and Random Forest, which are widely used in the literature for handling missing values, and XGBoost, which has received considerable attention in recent machine learning research.
We also aim to identify the imputation method with the best performance so that it can be applied in practice in future survey research, and to measure the optimal missing rate up to which values can be statistically imputed. Performance is evaluated in three stages: a visual comparison of the distributions of imputed and real values, measurement of accuracy and NRMSE, and finally a paired t-test to verify statistical significance.
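The two numeric stages of this evaluation can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, and it assumes NRMSE is the RMSE over the masked cells divided by the standard deviation of the true values (other normalizations, such as the range, are also in common use). The paired t-test part returns only the t statistic; converting it to a p-value requires a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def nrmse(true_vals, imputed_vals):
    """RMSE over the masked cells, divided by the population standard
    deviation of the true values (assumed normalization)."""
    if len(true_vals) != len(imputed_vals) or not true_vals:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(true_vals, imputed_vals)]
    spread = statistics.pstdev(true_vals)
    if spread == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(statistics.fmean(squared_errors)) / spread


def accuracy(true_labels, imputed_labels):
    """Share of masked categorical cells whose imputed label is correct."""
    if len(true_labels) != len(imputed_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(true_labels, imputed_labels))
    return matches / len(true_labels)


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired scores, e.g. the NRMSE of two imputation
    methods measured on the same repeated masking runs."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = statistics.stdev(diffs)  # sample standard deviation
    if spread == 0:
        raise ValueError("paired differences have zero variance")
    return statistics.fmean(diffs) / (spread / math.sqrt(len(diffs)))
```

Under these assumptions, a lower NRMSE and a higher accuracy indicate better imputation, and a large absolute t statistic indicates that the gap between two methods is unlikely to be due to chance across masking runs.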
Across three empirical analyses, the best-performing method differed from one analysis to another, so no single method could be singled out; this suggests that imputation performance depends on the characteristics of the data.
Keywords : missing value, MICE, MissForest, Random Forest, XGBoost