On Hierarchical Multiple Imputation Method for Handling Missing Data

Sheikhi, Ayyub; Arabpour, Alireza; Mohsen, Khosravi; Mashinchi, Mashallah; Pourmousa, Reza; Rezapour, Mohsen; Roastami, Mohammad Javad; Abbdollah Nejad, Amin; Badakhshan, Abed

doi:10.22103/jmmrc.2021.17749.1153

Document Type: Research Paper

Authors

¹ Department of Statistics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran

² Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran, and Kerman Chamber of Commerce, Industries, Mines and Agriculture, Kerman, Iran

³ Kerman Chamber of Commerce, Industries, Mines and Agriculture, Kerman, Iran

Abstract

In this work we apply a multiple imputation technique for handling missing observations. We propose an algorithm that performs hierarchical multiple imputation, using edit rules to impute the missing values. We assess the algorithm in a simulation study and illustrate it with a numerical application to a dataset from the Kerman Chamber of Commerce, Industries, Mines and Agriculture.

Keywords

20.1001.1.22517952.2021.10.2.8.9

Volume 10, Issue 2
Special Issue Dedicated to Professor M. Radjabalipour on the occasion of his 75th birthday. (Guest Editors: Dr. Alireza Bahrampour, Dr. Davood Khojasteh Salkouyeh)
October 2021
Pages 103-114
