Early Detection of Patient Surge Anomalies in Hospitals: A Comparative Analysis of Gradient Boosting, Random Forest, and SVM

*Masparudin Masparudin - Department of Software Engineering, Universitas Universal, Pasir Putih, Kel. Sadai, Kec. Bengkong, Kota Batam, Indonesia 29432
Marfuah Marfuah - Department of Information System, Universitas Universal, Pasir Putih, Kel. Sadai, Kec. Bengkong, Kota Batam, Indonesia 29432
Abdullah Abdullah - Department of Information System, Universitas Islam Indragiri, Parit 1, Jl. Propinsi No. 01, Tembilahan Hulu, Indragiri Hilir, Riau, Indonesia 29213
Open Access Copyright (c) 2026 Jurnal Sistem Informasi Bisnis

Abstract

Unpredictable fluctuations in patient visits often leave hospitals under-resourced and degrade service quality. This study aims to develop an early warning system for patient surges across 110 healthcare service units. Unlike conventional approaches that rely on static thresholds, it proposes a statistical anomaly detection method based on the Z-score for dynamic labeling and applies the Synthetic Minority Over-sampling Technique (SMOTE) to address extreme class imbalance. Three classification algorithms, Gradient Boosting Classifier (GBC), Random Forest (RF), and Support Vector Machine (SVM), were compared using time-series lag features and volatility trends. Experimental results show that Gradient Boosting outperformed the other two methods, achieving the highest F1-score (37.35%) and a recall of 48.98%, which indicates that it was the most robust of the three at detecting anomalies in imbalanced data. The study concludes that integrating statistical anomaly-based labeling with ensemble boosting algorithms effectively mitigates noise in heterogeneous hospital visit data, thereby serving as a reliable basis for proactive managerial decision-making.
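
The labeling and feature steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Z-score threshold of 2.0, the lag count, the volatility window, and the choice to compute the Z-score over one unit's whole series are all assumptions made for illustration. SMOTE and the three classifiers are not shown, since they depend on third-party libraries.

```python
from statistics import mean, pstdev


def zscore_labels(visits, threshold=2.0):
    """Label a visit count 1 (surge) when its Z-score exceeds `threshold`.

    The threshold of 2.0 is an assumed value, not one taken from the paper.
    """
    mu = mean(visits)
    sigma = pstdev(visits)
    if sigma == 0:
        # A constant series has no anomalies under this definition.
        return [0] * len(visits)
    return [1 if (v - mu) / sigma > threshold else 0 for v in visits]


def lag_features(visits, n_lags=2, window=3):
    """Build one feature row per time step from past values only.

    Each row holds the previous `n_lags` counts (most recent first) followed
    by the standard deviation of the previous `window` counts as a simple
    volatility measure. Returns the rows and the index of the first time
    step that has enough history.
    """
    start = max(n_lags, window)
    rows = []
    for t in range(start, len(visits)):
        lags = [visits[t - k] for k in range(1, n_lags + 1)]
        volatility = pstdev(visits[t - window:t])
        rows.append(lags + [volatility])
    return rows, start


def recall_and_f1(y_true, y_pred):
    """Compute recall and F1 for the positive (surge) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical daily visit counts for a single service unit.
visits = [10, 12, 11, 13, 12, 40, 11, 12]
labels = zscore_labels(visits)
rows, start = lag_features(visits)
```

In this sketch the labels are aligned with the feature rows by slicing `labels[start:]`, so each row uses only counts observed before the day it is meant to predict.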

Note: This article has supplementary file(s).

Supplementary file: Dataset (Type: Research Instrument; 236KB)
Keywords: Data Mining; Gradient Boosting; Patient Surge; Z-Score; Time-Series Prediction.
