Enhanced Breast Cancer Risk Classification Through Genetic Algorithm-Based Feature Selection and Machine Learning Techniques
Year 2025, Volume: 46 Issue: 2, 369 - 376, 30.06.2025
Aynur Yonar, Harun Yonar, Öznur Özaltın
Abstract
Breast cancer remains one of the leading causes of mortality among women worldwide and represents a major global health challenge. Accurate classification of breast tumors as benign or malignant is therefore of critical importance for timely diagnosis and effective treatment. This study aims to enhance breast cancer risk classification by integrating machine learning (ML) techniques with a genetic algorithm-based feature selection method. Initially, multiple ML algorithms are applied to features extracted from digitized images obtained through fine-needle aspiration (FNA) of breast masses. Subsequently, a genetic algorithm-based feature selection approach is employed to identify a subset of the most discriminative features. The results demonstrate that ML models utilizing the feature subsets selected by the genetic algorithm consistently achieve higher classification accuracy compared to their baseline counterparts. This highlights the effectiveness of the proposed feature selection strategy in improving the discriminative capacity of ML models. Beyond the observed improvements in accuracy, the refined ML models developed in this study show potential for more precise and reliable breast cancer diagnoses. By enhancing the performance of ML-based decision support systems, the genetic algorithm-based feature selection approach may contribute to the advancement of personalized treatment strategies in breast cancer care.
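The wrapper-style selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the nearest-centroid classifier, and every name and setting below (`make_data`, `centroid_accuracy`, `ga_select`, population size, mutation rate) are assumptions chosen so the example runs with the Python standard library alone. Each candidate is a binary mask over the features, and its fitness is the validation accuracy of the classifier trained on the masked features.

```python
import random

def make_data(n_samples=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (assumption): informative features shift with the label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]  # pure noise features
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(mask, X_train, y_train, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_val)

def ga_select(fitness, n_features, pop_size=20, generations=15, p_mut=0.1, seed=1):
    """Binary-mask genetic algorithm: tournament selection, one-point crossover,
    bit-flip mutation, and elitism (the best mask always survives)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [1] * n_features  # seed with the full feature set (the baseline)
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament of size 3
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)         # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

X, y = make_data()
X_train, y_train, X_val, y_val = X[:140], y[:140], X[140:], y[140:]
fitness = lambda mask: centroid_accuracy(mask, X_train, y_train, X_val, y_val)
baseline = fitness([1] * 10)  # accuracy with all features
best_mask, best_acc = ga_select(fitness, n_features=10)
print("baseline:", baseline, "selected mask:", best_mask, "accuracy:", best_acc)
```

Because the initial population contains the all-features mask and the best individual is carried over each generation, the selected subset never scores below the baseline on the validation split. That score is optimistic, since the same split drives the search; a fair comparison would report accuracy on a separate held-out test set or use nested cross-validation.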