A Comparative Study of Drug Prediction Models using KNN, SVM, and Random Forest
DOI:
https://doi.org/10.51519/journalisi.v7i1.1013
Keywords:
Drug classification, Machine Learning, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest, Predictive Modeling
Abstract
Accurate drug classification is essential in medical decision-making, ensuring that patients receive prescriptions appropriate to their physiological and biochemical characteristics. This study compares the performance of K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest models in predicting drug prescriptions from patient attributes such as age, sex, blood pressure, cholesterol level, and sodium-to-potassium ratio. The dataset, obtained from Kaggle, was preprocessed and split into training and testing sets, and the models were evaluated using accuracy as the primary metric. Random Forest outperformed KNN and SVM, achieving a test accuracy of 100%, indicating strong generalization and robustness on this dataset. SVM also performed well, with a test accuracy of 97.50%, while KNN achieved the lowest accuracy, 70%, indicating its limitations in handling complex feature interactions. These findings highlight the effectiveness of ensemble learning methods in medical classification tasks and suggest that Random Forest is the most suitable of the three models for drug prediction. Applied in clinical settings, such models could help improve treatment outcomes and patient care. Future research should explore feature engineering techniques, larger datasets, and additional machine learning approaches to improve predictive accuracy and applicability in real-world healthcare settings.
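The evaluation protocol described in the abstract (hold-out split, then test accuracy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the records are synthetic, the numeric encoding of the attributes and the labeling rule are invented, and the helper names (`make_synthetic_records`, `train_test_split`, `knn_predict`, `accuracy`) are hypothetical. It uses a small from-scratch k-nearest-neighbors classifier and does not reproduce the study's dataset, preprocessing, or reported figures.

```python
import math
import random
from collections import Counter

# Hypothetical encoding of the patient attributes named in the abstract:
# (age, sex, blood_pressure, cholesterol, na_to_k) -> drug label.
# The records and the labeling rule are synthetic, for illustration only.
def make_synthetic_records(n, seed=0):
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        age = rng.randint(15, 75)
        sex = rng.randint(0, 1)           # 0 = female, 1 = male
        bp = rng.randint(0, 2)            # 0 = low, 1 = normal, 2 = high
        chol = rng.randint(0, 1)          # 0 = normal, 1 = high
        na_to_k = rng.uniform(6.0, 38.0)  # sodium-to-potassium ratio
        # Invented rule so that the labels are learnable.
        if na_to_k > 15.0:
            label = "drugY"
        elif bp == 2:
            label = "drugA"
        else:
            label = "drugX"
        records.append(((age, sex, bp, chol, na_to_k), label))
    return records

def train_test_split(records, test_fraction=0.2, seed=0):
    # Shuffle a copy, then hold out the first test_fraction of it.
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, features, k=5):
    # Take the k training records closest to the query (Euclidean distance)
    # and return the most common label among them.
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=5):
    # Fraction of held-out records whose predicted label matches the true one.
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

records = make_synthetic_records(200)
train, test = train_test_split(records)
acc = accuracy(train, test)
print(f"KNN test accuracy on synthetic data: {acc:.2%}")
```

The same split and accuracy computation would apply unchanged to the other two models. A library such as scikit-learn provides `KNeighborsClassifier`, `SVC`, and `RandomForestClassifier`, although the abstract does not state which implementation the authors used.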