ENHANCING BREAST CANCER CLASSIFICATION PERFORMANCE THROUGH FEATURE SELECTION AND MACHINE LEARNING: A COMPREHENSIVE ANALYSIS
Breast cancer, ensemble learning, feature selection, machine learning, WBCD dataset
Abstract
As the burden of chronic illness on healthcare systems worldwide rises, effective methods for early diagnosis and treatment are required. Breast cancer is prevalent and potentially fatal, and it needs early diagnosis for effective treatment. Machine learning algorithms applied to clinical data have shown promise for accurate breast cancer classification. The main goal of this research is to build classification models using k-Nearest Neighbour, AdaBoost, XGBoost, Decision Trees, Random Forests, Support Vector Machines (SVM), and Gradient Boosting. Choosing the right set of features is a critical step in maximising the performance of machine learning classifiers. Three feature selection techniques are used in this work: sequential feature selection, information gain-based selection, and correlation-based selection. Each technique yields a feature subset, which is then used to train the machine learning classifiers, and the most effective subset is identified by comparing the resulting classification performance. In addition to the individual models, the three top-performing models are combined in an ensemble using a Max Voting Classifier. The research uses breast tissue samples from 569 patients in the Diagnostic Wisconsin Breast Cancer Database (WBCD), of which 357 (62.74%) are benign and 212 (37.26%) are malignant. Each sample is described by thirty features. The findings suggest that careful feature selection can substantially improve the performance of machine learning algorithms. Among the models studied, the SVM stands out as particularly effective for classifying breast cancer. Furthermore, Recursive Feature Elimination (RFE) combined with SVM performs particularly well for the classification of breast tumours, particularly in their early stages.
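The workflow summarised in the abstract (select a feature subset, train classifiers on it, then combine three models by majority vote) can be sketched roughly as follows. This is a minimal illustration, not the authors' configuration: it assumes scikit-learn and its bundled copy of the diagnostic Wisconsin data, and the 80/20 split, the choice of ten features, the linear SVM kernel, and the three ensemble members are all assumptions made for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# scikit-learn's bundled diagnostic Wisconsin data: 569 samples, 30 features.
# In this copy, label 1 is benign and label 0 is malignant.
X, y = load_breast_cancer(return_X_y=True)

# Assumed 80/20 stratified split; the abstract does not state the protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# RFE needs an estimator that exposes coef_ or feature_importances_,
# so a linear-kernel SVM is used to rank features (an assumption).
selector = RFE(SVC(kernel="linear"), n_features_to_select=10)

# Max (hard) voting over three models; the members are illustrative only,
# since the abstract does not name the three top performers.
voter = VotingClassifier(
    estimators=[
        ("svm", SVC(kernel="linear")),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="hard",
)

# Scaling, selection, and voting are chained so that the selector is
# fitted on the training split only.
model = Pipeline([("scale", StandardScaler()), ("select", selector), ("vote", voter)])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
n_selected = int(model.named_steps["select"].support_.sum())
print(f"features kept: {n_selected}, test accuracy: {accuracy:.3f}")
```

Placing the selector inside the pipeline keeps the held-out samples from influencing which features are chosen, which matters when comparing feature subsets by test performance.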
How to Cite
Dr. Karuna S Bhosale, Dr. Mayur Dilip Jakhete, Dr. Umesh Trambakrao Kute, Dr. Maria Nenova, ENHANCING BREAST CANCER CLASSIFICATION PERFORMANCE THROUGH FEATURE SELECTION AND MACHINE LEARNING: A COMPREHENSIVE ANALYSIS, Journal of Advanced Research in Applied Sciences and Engineering Technology, Vol. 7, Issue 2, July (2025)