A new model for lung cancer prediction based on differential evolution algorithm and effective feature selection

Document Type: Research Paper

Author

Computer Engineering Department, Bardsir Branch, Islamic Azad University, Bardsir, Iran

Abstract

Lung cancer is one of the deadliest diseases worldwide, and early detection is a key factor in improving patient survival. Machine learning techniques combined with optimization algorithms can support early prediction and diagnosis of the disease. This paper proposes a lung cancer prediction method that combines two techniques: the Differential Evolution (DE) algorithm and the Support Vector Machine (SVM). First, the DE algorithm selects the features most relevant to lung cancer prediction. Then an SVM classifier is trained on the selected features to build the prediction model. The proposed approach is implemented on two lung cancer datasets and achieves good accuracy, which is compared with that of four other methods: the C4.5 decision tree, a neural network, the Naive Bayes classifier, and logistic regression. With its high accuracy and generalization ability, the proposed model is suitable for lung cancer detection and can serve as a decision support system alongside medical professionals.
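The two-stage idea in the abstract (DE searches for a feature subset, a classifier scores each candidate subset) can be illustrated with a minimal sketch. The abstract does not give the encoding or parameter values, so everything below is an assumption made for illustration: the DE/rand/1/bin variant, the real-valued genes thresholded at 0.5 to form a feature mask, the values of `F`, `CR`, `pop_size`, and `generations`, and the hypothetical `toy_score` function. In a setting like the paper's, `score` would plausibly be the cross-validated accuracy of an SVM trained on the masked features; a toy function stands in for it here so the sketch runs with the standard library alone.

```python
import random


def de_feature_selection(score, n_features, pop_size=20, generations=40,
                         F=0.5, CR=0.9, seed=0):
    """Search for a feature mask with DE/rand/1/bin (assumed variant).

    Each individual is a real vector in [0, 1]; feature j is kept when
    gene j exceeds 0.5 (an assumed encoding, not taken from the paper).
    `score(mask)` must return a number where higher is better.
    """
    rng = random.Random(seed)

    def decode(vec):
        return [g > 0.5 for g in vec]

    def fitness(vec):
        mask = decode(vec)
        # An empty subset cannot be classified on, so rank it last.
        return score(mask) if any(mask) else float("-inf")

    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    fit = [fitness(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # force at least one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # clip back into [0, 1]
                else:
                    v = pop[i][j]
                trial.append(v)
            f_trial = fitness(trial)
            # Greedy selection: the trial replaces the target if not worse.
            if f_trial >= fit[i]:
                pop[i], fit[i] = trial, f_trial

    best = max(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]


def toy_score(mask):
    """Hypothetical stand-in for classifier accuracy.

    Features 0 and 3 are treated as informative; every other selected
    feature costs a small penalty.
    """
    informative = {0, 3}
    hits = sum(1 for j, keep in enumerate(mask) if keep and j in informative)
    extras = sum(1 for j, keep in enumerate(mask) if keep and j not in informative)
    return hits - 0.25 * extras


mask, best = de_feature_selection(toy_score, n_features=6)
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("score:", best)
```

Keeping the scoring function pluggable separates the search from the classifier, so the same loop could wrap an SVM or any of the baseline classifiers named in the abstract without changes to the DE code.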

