Ensemble of semi-supervised feature selection algorithms to reinforce heuristic function in ant colony optimization

Document Type: Research Paper

Authors

Department of Computer Engineering, Lorestan University, Khoramabad, Iran

Abstract

Feature selection (FS) is a well-known dimensionality reduction method that chooses a promising subset of the original features to diminish the influence of the curse of dimensionality. FS improves learning performance by removing irrelevant and redundant features. Semi-supervised learning becomes important when labeled instances are scarce, since labeling data can be costly or time-consuming; most samples in semi-supervised learning are therefore unlabeled. Semi-supervised FS techniques address this problem by simultaneously exploiting information from both labeled and unlabeled data. This article presents a new semi-supervised FS method called ESACO. ESACO combines the ant colony optimization (ACO) algorithm with a set of heuristics to select the best features. ACO is a metaheuristic for solving optimization problems, and heuristic selection is a significant component of ACO that influences the movements of the ants. Using multiple heuristics rather than a single one can improve ACO's performance: it explores additional aspects of the search space and provides more information for reaching better solutions. Thus, in ESACO we employ an ensemble of heuristic functions, integrating them through a Multi-Criteria Decision-Making (MCDM) procedure. To date, the use of multiple heuristics in ACO has not been studied for semi-supervised FS. We compared ESACO, using the KNN classifier in various experiments, against eight semi-supervised FS techniques on 15 datasets. The results show that the presented method is significantly more effective than the competing methods. The article's code is available on GitHub: https://github.com/frshkara/ESACO.
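To make the core idea concrete, the following is a minimal illustrative sketch (not the paper's actual ESACO implementation) of how several per-feature heuristic scores could be fused into a single desirability value by a simple weighted-sum MCDM aggregation, which an ant then uses together with pheromone levels to pick a feature. The heuristic names, scores, and weighting scheme below are assumptions made purely for illustration; the real heuristics and MCDM procedure are defined in the paper.

```python
import random

# Toy per-feature scores from two hypothetical heuristics (assumed values):
# each criterion rates every candidate feature.
heuristics = {
    "relevance": [0.8, 0.1, 0.6, 0.3],  # e.g. correlation with known labels
    "structure": [0.5, 0.9, 0.4, 0.2],  # e.g. an unsupervised graph-based score
}

def mcdm_aggregate(heuristics, weights=None):
    """Combine criteria into one score per feature using a simple
    weighted-sum MCDM (SAW-style) after min-max normalization."""
    names = list(heuristics)
    weights = weights or {n: 1.0 / len(names) for n in names}
    n_feats = len(next(iter(heuristics.values())))
    combined = [0.0] * n_feats
    for name in names:
        scores = heuristics[name]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0
        for j, s in enumerate(scores):
            combined[j] += weights[name] * (s - lo) / span
    return combined

def ant_pick(pheromone, eta, alpha=1.0, beta=2.0, rng=random):
    """One ACO transition: choose a feature with probability
    proportional to tau**alpha * eta**beta (roulette-wheel selection)."""
    attract = [(t ** alpha) * (e ** beta) for t, e in zip(pheromone, eta)]
    total = sum(attract)
    r, acc = rng.random() * total, 0.0
    for j, a in enumerate(attract):
        acc += a
        if acc >= r:
            return j
    return len(attract) - 1

eta = mcdm_aggregate(heuristics)       # ensemble heuristic per feature
tau = [1.0] * len(eta)                 # uniform initial pheromone
picked = ant_pick(tau, eta)            # index of the feature an ant selects
```

In a full algorithm, ants would iteratively build feature subsets this way, pheromone would be reinforced on features appearing in subsets that score well with a KNN classifier, and the process would repeat until convergence.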

