GKRR: A gravitational-based kernel ridge regression for software development effort estimation

Document Type : Research Paper

Authors

1 Department of Computer Engineering, Lorestan University, Khorramabad, Iran

2 Department of Computer Engineering, Faculty of Engineering, Yazd University, Yazd, Iran

3 Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran

Abstract

Software Development Effort Estimation (SDEE) concerns estimating the effort required to produce a new software system. To increase estimation accuracy, researchers have proposed various machine learning regressors for SDEE. Kernel Ridge Regression (KRR) is a powerful machine learning technique that has shown good potential for solving regression problems. The Gravitational Search Algorithm (GSA) is a population-based metaheuristic that searches for the optimal solution of complex optimization problems. This article presents a hybrid GSA that combines binary-valued GSA (BGSA) and real-valued GSA (RGSA) to optimize the KRR parameters and to select an appropriate subset of features, with the aim of enhancing the estimation accuracy of SDEE. Two benchmark datasets from the software projects domain, Desharnais and Albrecht, are used to assess the performance of the proposed method against similar methods in the literature. The experimental results on both datasets confirm that the proposed method significantly increases estimation accuracy compared with some recently published SDEE methods.
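To make the idea in the abstract more concrete, the following is a minimal sketch of how one candidate solution could be scored: a binary feature mask (the part a BGSA would search over) and two real-valued KRR parameters (the part an RGSA would search over). It is not the authors' implementation. The RBF kernel, the mean absolute error measure, the function names, the parameter names `lam` and `gamma`, and the synthetic data are all assumptions made for illustration; the GSA search loop itself is omitted.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit_predict(X_train, y_train, X_test, lam, gamma):
    # Closed-form KRR: solve (K + lam * I) alpha = y, then predict with k(x, X_train) @ alpha.
    K = rbf_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train, gamma) @ alpha

def fitness(mask, lam, gamma, X_train, y_train, X_val, y_val):
    # Hypothetical fitness (lower is better): validation MAE of KRR
    # trained only on the features selected by the binary mask.
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return float("inf")  # an empty feature subset cannot be evaluated
    pred = krr_fit_predict(X_train[:, mask], y_train, X_val[:, mask], lam, gamma)
    return float(np.mean(np.abs(pred - y_val)))

# Synthetic data (assumption): only the first two of five features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=60)
X_train, y_train, X_val, y_val = X[:40], y[:40], X[40:], y[40:]

# Two candidate solutions sharing the same KRR parameters.
relevant = fitness([1, 1, 0, 0, 0], 0.1, 0.5, X_train, y_train, X_val, y_val)
irrelevant = fitness([0, 0, 1, 1, 1], 0.1, 0.5, X_train, y_train, X_val, y_val)
print(f"MAE with relevant features:   {relevant:.3f}")
print(f"MAE with irrelevant features: {irrelevant:.3f}")
```

In a hybrid search of the kind the abstract describes, a function like `fitness` would be called once per agent per iteration, with the mask and the real-valued parameters updated by their respective GSA variants.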

Volume 11, Issue 3 - Serial Number 23
Special Issue dedicated to Prof. Mashaallah Mashinchi.
November 2022
Pages 147-174
  • Receive Date: 07 February 2022
  • Revise Date: 07 July 2022
  • Accept Date: 25 July 2022