Research Article

Enhanced Breast Cancer Risk Classification Through Genetic Algorithm-Based Feature Selection and Machine Learning Techniques

Year 2025, Volume: 46 Issue: 2, 369 - 376, 30.06.2025
https://doi.org/10.17776/csj.1443598

Abstract

Breast cancer remains one of the leading causes of mortality among women worldwide and represents a major global health challenge. Accurate classification of breast tumors as benign or malignant is therefore critical for timely diagnosis and effective treatment. This study aims to enhance breast cancer risk classification by integrating machine learning (ML) techniques with a genetic algorithm-based feature selection method. First, multiple ML algorithms are applied to features extracted from digitized images of fine-needle aspirates (FNA) of breast masses. Next, a genetic algorithm-based feature selection approach is employed to identify a subset of the most discriminative features. The results demonstrate that ML models using the feature subsets selected by the genetic algorithm consistently achieve higher classification accuracy than their baseline counterparts trained on the full feature set. This highlights the effectiveness of the proposed feature selection strategy in improving the discriminative capacity of ML models. Beyond the observed improvements in accuracy, the refined ML models developed in this study show potential for more precise and reliable breast cancer diagnoses. By enhancing the performance of ML-based decision support systems, the genetic algorithm-based feature selection approach may contribute to the advancement of personalized treatment strategies in breast cancer care.
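
The wrapper-style selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic rather than the FNA features, a simple nearest-centroid classifier stands in for the ML models, and all names and settings (`make_data`, `accuracy`, `fitness`, `ga_select`, the population size, mutation rate, and per-feature penalty) are assumptions chosen for illustration. Each candidate solution is a bit mask over the features, and its fitness is the validation accuracy of the classifier restricted to the selected features, minus a small penalty per feature.

```python
import random

def make_data(n=150, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data; only the first n_informative columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate classes so every split contains both
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def accuracy(train, test, mask):
    """Accuracy of a nearest-centroid classifier using only the masked features."""
    (X_tr, y_tr), (X_te, y_te) = train, test
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in centroids}
        hits += min(dist, key=dist.get) == t
    return hits / len(y_te)

def fitness(mask, train, val, penalty=0.005):
    """Validation accuracy minus a small (assumed) cost per selected feature."""
    return accuracy(train, val, mask) - penalty * sum(mask)

def ga_select(train, val, pop_size=20, generations=15, p_mut=0.1, seed=1):
    """Genetic algorithm over feature bit masks with elitism."""
    rng = random.Random(seed)
    n = len(train[0][0])
    score = lambda m: fitness(m, train, val)
    # Seed the population with the all-features baseline plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = [max(pop, key=score)[:]]  # elitism: carry the best mask over
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # binary tournament selection
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n)                # one-point crossover
            child = [1 - g if rng.random() < p_mut else g  # bit-flip mutation
                     for g in p1[:cut] + p2[cut:]]
            children.append(child)
        pop = children
    return max(pop, key=score)

X, y = make_data()
train, val, test = (X[:70], y[:70]), (X[70:110], y[70:110]), (X[110:], y[110:])
best_mask = ga_select(train, val)
full_mask = [1] * len(X[0])
print("selected features:", [j for j, g in enumerate(best_mask) if g])
print("test accuracy, all features:", accuracy(train, test, full_mask))
print("test accuracy, GA subset:   ", accuracy(train, test, best_mask))
```

Because the initial population contains the all-features mask and the best mask is always carried forward, the returned subset can never score worse than the baseline on the validation fitness. Accuracy on the held-out test split is reported separately and is not guaranteed to improve.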


Details

Primary Language English
Subjects Biostatistics, Applied Statistics, Operation
Journal Section Natural Sciences
Authors

Aynur Yonar 0000-0003-1681-9398

Harun Yonar 0000-0003-1574-3993

Öznur Özaltın 0000-0001-9841-1702

Publication Date June 30, 2025
Submission Date February 27, 2024
Acceptance Date June 3, 2025
Published in Issue Year 2025, Volume: 46 Issue: 2

Cite

APA Yonar, A., Yonar, H., & Özaltın, Ö. (2025). Enhanced Breast Cancer Risk Classification Through Genetic Algorithm-Based Feature Selection and Machine Learning Techniques. Cumhuriyet Science Journal, 46(2), 369-376. https://doi.org/10.17776/csj.1443598