
Abstract

Breast cancer remains a critical global health concern, requiring advanced and reliable diagnostic methods for early detection and effective intervention. This work introduces an integrated ensemble framework that combines multiple dimensionality reduction (DR) techniques, including Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), and Singular Value Decomposition (SVD), with robust machine learning (ML) classifiers for improved breast cancer detection. The publicly available Wisconsin Breast Cancer Dataset (WBCD) was used, with data preprocessing that addressed missing values, anomalies, and class imbalance using median imputation and stratified sampling. To mitigate overfitting and underfitting, dimensionality reduction was coupled with cross-validation and ensemble strategies. The predictive performance of Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Multi-Layer Perceptron (MLP) was systematically evaluated. Experimental results show that SVM consistently achieves a maximum accuracy of 97.9% across all applied DR techniques, while MLP and LR also reach 97.9% accuracy with PCA and NMF, though MLP performance varies with the selected DR method. The findings provide practical guidance for healthcare practitioners and researchers, supporting the adoption of explainable and scalable AI-driven diagnostic tools. Limitations include the reliance on a single dataset and the need for further validation on larger and more diverse clinical cohorts. Future work will focus on enhancing model interpretability, external validation, and real-world deployment in resource-constrained settings.
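The kind of pipeline the abstract describes (median imputation, a DR step, a classifier, and stratified cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's bundled Wisconsin diagnostic data as a stand-in for WBCD, picks an arbitrary number of components, uses scikit-learn's `TruncatedSVD` for the SVD step, adds min-max scaling so that NMF receives non-negative inputs, and evaluates only an SVM with default settings. The accuracies it prints should not be expected to match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import NMF, PCA, TruncatedSVD
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Assumption: the component count is illustrative; the abstract does not state one.
N_COMPONENTS = 5

# Assumption: scikit-learn's bundled diagnostic dataset stands in for WBCD.
# It has no missing values, so the median imputer below is a no-op here and
# is included only to mirror the preprocessing step named in the abstract.
X, y = load_breast_cancer(return_X_y=True)

reducers = {
    "PCA": PCA(n_components=N_COMPONENTS, random_state=0),
    "NMF": NMF(n_components=N_COMPONENTS, init="nndsvda", max_iter=2000, random_state=0),
    "SVD": TruncatedSVD(n_components=N_COMPONENTS, random_state=0),
}

# Stratified folds keep the benign/malignant ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, reducer in reducers.items():
    model = make_pipeline(
        SimpleImputer(strategy="median"),  # median imputation of missing values
        MinMaxScaler(clip=True),           # keeps inputs in [0, 1], as NMF requires
        reducer,                           # dimensionality reduction step
        SVC(kernel="rbf"),                 # SVM classifier with default settings
    )
    # Every step is refit inside each fold, so no test-fold information leaks in.
    scores[name] = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()

for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Wrapping the imputer, scaler, and reducer in a single pipeline means they are fitted on the training folds only, which is one common way to keep cross-validated estimates honest when DR is part of the model.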

Details

1009240
Title
Integrated Ensemble Strategy for Breast Cancer Detection Using Dimensionality Reduction Technique
Volume
14
First page
e31899
Number of pages
18
Publication year
2025
Publication date
2025
Section
Articles
Publisher
Ediciones Universidad de Salamanca
Place of publication
Salamanca
Country of publication
Spain
e-ISSN
2255-2863
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-10-31
Milestone dates
2025-10-31 (Created); 2024-01-14 (Submitted); 2025-02-27 (Issued); 2025-11-05 (Modified); 2025-06-15 (Accepted)
First posting date
31 Oct 2025
ProQuest document ID
3282913687
Document URL
https://www.proquest.com/scholarly-journals/integrated-ensemble-strategy-breast-cancer/docview/3282913687/se-2?accountid=208611
Copyright
© 2025. This work is licensed under https://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-15
Database
ProQuest One Academic