Ensemble Learning for Software Defect Prediction: Performance, Practicality and Future Directions
Abstract
Ensemble learning is a leading approach in software defect prediction (SDP), offering improved predictive performance on imbalanced and high-dimensional datasets. Despite growing research interest, persistent gaps in model interpretability, generalizability, and reproducibility limit its practical adoption. This paper presents a comprehensive analysis of 56 peer-reviewed studies published between 2020 and 2025, spanning both journal and conference venues. Findings show that ensemble methods, especially when combined with sampling, feature selection, or optimisation, consistently outperform single classifiers on key metrics such as F1-score, area under the curve (AUC), and Matthews correlation coefficient (MCC). Nonetheless, few studies incorporate explainability frameworks, effort-aware evaluation, or cross-project validation. Additionally, most models are static, rely on within-project testing, and depend on legacy datasets such as PROMISE and NASA, all of which limit external validity. Building on this synthesis, the review highlights future research priorities, including interpretable ensemble architectures, adaptive modelling, dynamic imbalance handling, semantic feature integration, and real-time prediction. Standardised benchmarks and transparent, scalable designs are recommended to bridge the gap between experimental performance and deployment-ready SDP solutions.
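As a brief illustration of why the abstract singles out F1-score and MCC for imbalanced defect data, the sketch below computes both from confusion-matrix counts and compares them with plain accuracy. The data are hypothetical (100 modules, 10 of them defective) and are not drawn from any of the reviewed studies; the function names are likewise illustrative.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, tn, fp, fn):
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical project: 10 defective modules followed by 90 clean ones.
y_true = [1] * 10 + [0] * 90

# Baseline that labels every module as clean.
always_clean = [0] * 100

# Hypothetical model: finds 6 of the 10 defects and raises 5 false alarms.
model_pred = [1] * 6 + [0] * 4 + [1] * 5 + [0] * 85

for name, pred in [("always clean", always_clean), ("model", model_pred)]:
    counts = confusion_counts(y_true, pred)
    print(f"{name}: acc={accuracy(*counts):.2f} "
          f"f1={f1_score(*counts):.2f} mcc={mcc(*counts):.2f}")
```

Under these assumed counts, the two classifiers are nearly indistinguishable on accuracy, while F1-score and MCC both drop to zero for the baseline that never flags a defect.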
Copyright (c) 2025 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.