Abstract
Breast cancer is one of the most common cancers in women, and early detection significantly improves survival. Mammography is among the least invasive and most widely used diagnostic methods, but the images are complex and their interpretation is subjective, which makes diagnosis challenging. In this article, we propose a new approach that supports the breast cancer diagnosis process by assisting in the classification of mammography images, either as malignant or benign or according to the BI-RADS system. The proposal has two phases. In the first, we apply the FP-Growth algorithm to patients’ clinical data, analyzing variables such as age and sex to identify frequent patterns. This lets us explore, group, and visually characterize shared findings and trends in the clinical data, which helps doctors define risk groups or establish a pre-diagnosis from a patient’s profile. In this phase we also prepare the images for training the models. In the second, we combine two models through stacking, a Random Forest (RF) and a Convolutional Neural Network (CNN) with transfer learning, to improve image classification and diagnosis. We also explore other methods, such as CNN and Support Vector Machine (SVM), to compare the proposed methodology against conventional techniques. The models were trained on two public datasets: “The Chinese Mammography Database” [2] and “The INbreast database” [3]. Performance is evaluated with standard classification metrics: accuracy, precision, recall, and F1 score. The results show that combining base models through stacking performs significantly better than the individual models, with ideal scores in accuracy, recall, and F1 score under k-fold cross-validation of the meta-model. These results suggest that combining multiple base models captures the underlying complexities and patterns in the data more effectively.
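To make the first phase concrete, the following is a minimal pure-Python sketch of FP-Growth mining over categorical clinical attributes. It is not the authors' implementation: the toy records, the item labels (such as `age=50+` and `finding=mass`), the function names, and the support threshold are all invented for illustration, and a real pipeline would first discretize variables like age into bins.

```python
from collections import defaultdict


class Node:
    """One node of an FP-tree: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_count):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of tree nodes) and the support
    counts of the items that reach min_count.
    """
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_count}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_count, suffix=frozenset()):
    """Yield (itemset, support_count) for every frequent itemset."""
    header, frequent = build_tree(transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, frequent[item]
        # Conditional pattern base: the prefix paths that lead to `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_count, itemset)


# Hypothetical, hand-made records; not taken from CMMD or INbreast.
records = [
    {"age=50+", "sex=F", "finding=mass"},
    {"age=50+", "sex=F", "finding=calcification"},
    {"age=50+", "sex=F", "finding=mass"},
    {"age<50", "sex=F", "finding=mass"},
    {"age<50", "sex=F"},
]
patterns = dict(fp_growth([(r, 1) for r in records], min_count=3))

for itemset, count in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), count)
```

Each returned itemset is a combination of attribute values shared by at least `min_count` records, which is the kind of pattern that could be used to describe a risk group. The minimum support would need tuning on real data.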
References
American Cancer Society. Breast cancer facts and figures 2021–2022 (2022). https://www.cancer.org/es/cancer/prevencion-del-riesgo/entender-el-riesgo-de-cancer/cancer-datos-factuales/informacion-sobre-el-cancer-para-mujeres.html
Cui, C., et al.: Chinese mammography database (CMMD): a biopsy-confirmed mammography database online for automatic breast diagnosis. Cancer Imaging Archive (2021). https://doi.org/10.7937/tcia.eqde-4b16
Holeček, M.: InBreast [Data set] (2020). https://www.kaggle.com/datasets/martholi/inbreast
Hurtado, R., Guzmán, S., Muñoz, A.: An architecture and a new deep learning method for head and neck cancer prognosis by analyzing serial positron emission tomography images. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds.) JCC-BD &ET 2023. Communications in Computer and Information Science, vol. 1828, pp. 129–140. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-40942-4_10
Huang, M.-L., Lin, T.-Y.: Dataset of breast mammography images with masses. Data Brief 31, 105928 (2020). https://doi.org/10.1016/j.dib.2020.105928
Sanmartín, J., Azuero, P., Hurtado, R.: A modern approach to osteosarcoma tumor identification through integration of FP-growth, transfer learning and stacking model. In: Rocha, Á., Ferrás, C., Hochstetter Diez, J., Diéguez Rebolledo, M. (eds.) ICITS 2024. LNNS, vol. 932, pp. 298–307. Springer, Cham (2024). https://doi.org/10.1007/978-3-031-54235-0_28
Zhang, Y., et al.: Deep learning-based automatic diagnosis of breast cancer on MRI using mask R-CNN for detection followed by ResNet50 for classification. Acad. Radiol. 30(Supplement 2), S161–S171 (2023). https://doi.org/10.1016/j.acra.2022.12.038. ISSN 1076-6332
Caruana, R., et al.: Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015). https://doi.org/10.1145/2783258.2788613
Huang, Y.: Prediction of breast cancer via deep learning. In: Patnaik, S., Kountchev, R., Tai, Y., Kountcheva, R. (eds.) 3D Imaging—Multidimensional Signal Processing and Deep Learning. Smart Innovation, Systems and Technologies, vol. 349, pp. 87–97. Springer, Singapore (2023). https://doi.org/10.1007/978-981-99-1230-8_8
Novillo, E., Montesdeoca, M., Hurtado, R.: Cutting-edge advanced machine learning model for enhanced breast cancer diagnostics. In: Yang, X.S., Sherratt, S., Dey, N., Joshi, A. (eds.) ICICT 2024. LNNS, vol. 1003, pp. 463–472. Springer, Singapore (2024). https://doi.org/10.1007/978-981-97-3302-6_37
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sanmartín, J., Azuero, P., Hurtado, R. (2025). New Approach to Support the Breast Cancer Diagnosis Process Using Frequent Pattern Growth and Stacking Based on Machine Learning Techniques. In: Julian, V., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2024. IDEAL 2024. Lecture Notes in Computer Science, vol 15347. Springer, Cham. https://doi.org/10.1007/978-3-031-77738-7_4
DOI: https://doi.org/10.1007/978-3-031-77738-7_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-77737-0
Online ISBN: 978-3-031-77738-7
eBook Packages: Computer Science, Computer Science (R0)