Narrating Minimal Data: Rethinking Cohort-Based GPA Prediction in Low-Resource Higher Education Contexts

Berdinata Massang (1), Rolty Glendy Wowiling (2), Allin Junikhah (3), Firmanians Romula Tuerah (4), Andrew Nathanael Ratag (5), Febri Kurnia Manoppo (6)
(1) Institut Agama Kristen Negeri Manado, Indonesia
(2) Institut Agama Kristen Negeri Manado, Indonesia
(3) Universitas Islam Negeri Maulana Malik Ibrahim Malang, Indonesia
(4) Institut Agama Kristen Negeri Manado, Indonesia
(5) Institut Agama Kristen Negeri Manado, Indonesia
(6) Hoseo University, Republic of Korea

Abstract

Background. Student performance prediction has become a major topic in educational data mining and learning analytics. However, most previous studies rely on high-dimensional datasets such as attendance records, course-level grades, and learning management system logs, which are often unavailable in institutions with limited digital infrastructure.


Purpose. This study aims to evaluate the feasibility of predicting student academic performance using minimal institutional data and to establish a practical baseline for machine learning implementation in low-resource higher education contexts. Rather than maximizing predictive accuracy, this research examines the lower boundary of predictive capability when only simple academic variables are available.


Method. A quantitative descriptive–predictive design was applied to 355 student records from the Christian Religious Education Study Program at IAKN Manado, Indonesia. GPA values were categorized into four classes (Poor, Fair, Good, and Very Good). The dataset was split into 75% training and 25% testing subsets, and class imbalance was addressed using SMOTE. Four models were evaluated: Dummy Classifier, Decision Tree, Random Forest, and Neural Network (MLP). Performance was assessed using accuracy and 5-fold cross-validation.
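The class-balancing step named above can be illustrated with a minimal sketch. The abstract does not list the input features or name the software used, so the two toy features, the function name smote_oversample, and the parameter k below are hypothetical. The code is a simplified, pure-Python rendering of the SMOTE idea (new minority samples interpolated between same-class neighbours); it is not the authors' implementation and not the imbalanced-learn API.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Balance classes by adding synthetic rows interpolated between
    same-class neighbours (a simplified SMOTE, for illustration only)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # every class is raised to the majority size
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no same-class neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # Up to k nearest same-class neighbours of the base row
            others = [m for m in members if m is not base]
            others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the base-to-neighbour segment
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical training rows: two invented numeric features per student.
X_train = [
    [1.8, 15], [1.9, 16],
    [2.1, 18], [2.3, 20], [2.4, 19],
    [2.9, 19], [3.0, 21], [3.1, 20], [3.2, 22], [3.3, 21],
    [3.6, 22], [3.7, 23], [3.8, 24],
]
y_train = (["Poor"] * 2 + ["Fair"] * 3 + ["Good"] * 5 + ["Very Good"] * 3)

X_bal, y_bal = smote_oversample(X_train, y_train)
print(Counter(y_train))  # class counts before balancing
print(Counter(y_bal))    # class counts after balancing
```

Oversampling is commonly applied to the training subset only, so that the test subset keeps the original class distribution; the abstract does not state which subsets were resampled.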


Results. The Dummy Classifier achieved an accuracy of 15.73%, establishing a realistic baseline under balanced class conditions. Decision Tree and Random Forest both reached the highest accuracy, 46.06%, while the Neural Network achieved 40.44%. Cross-validation scores, however, were lower than these test-set accuracies, indicating limited generalization and possible overfitting under minimal-feature conditions.
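To give a sense of scale, the reported percentages can be converted into approximate counts of correctly classified test records. The abstract does not state the test-set size; the sketch below assumes that 25% of 355 records is rounded up to 89, as scikit-learn's train_test_split does for a fractional test size. The helper name implied_correct is hypothetical, and the resulting counts are an inference, not figures reported by the authors.

```python
import math

N_RECORDS = 355       # from the abstract
TEST_FRACTION = 0.25  # from the abstract

# Assumption: the test-set size is rounded up to a whole record.
n_test = math.ceil(N_RECORDS * TEST_FRACTION)

# Accuracies as reported in the abstract (percent).
reported = {
    "Dummy Classifier": 15.73,
    "Decision Tree / Random Forest": 46.06,
    "Neural Network (MLP)": 40.44,
}


def implied_correct(accuracy_pct, n):
    """Nearest whole number of correct predictions for a given accuracy."""
    return round(accuracy_pct / 100 * n)


for name, acc in reported.items():
    k = implied_correct(acc, n_test)
    print(f"{name}: about {k} of {n_test} correct ({100 * k / n_test:.2f}%)")

# Expected accuracy of uniform random guessing over four classes, for reference.
print(f"Uniform guessing over 4 classes: {100 / 4:.2f}%")
```

Under that assumption the reported figures correspond to roughly 14, 41, and 36 correct predictions out of 89, so the gap between the tree-based models and the Neural Network amounts to about five test records.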


Conclusion. This study shows that simple institutional data can still provide non-trivial predictive signals, but predictive performance remains moderate. The main contribution of this study lies in positioning minimal-data prediction as a baseline methodological framework for institutions with constrained academic datasets, rather than as a high-accuracy predictive solution.

Primary contact: Berdinata Massang (berdinatam@gmail.com)

Massang, B., Wowiling, R. G., Junikhah, A., Tuerah, F. R., Ratag, A. N., & Manoppo, F. K. (2026). Narrating Minimal Data: Rethinking Cohort-Based GPA Prediction in Low-Resource Higher Education Contexts. International Journal of Educational Narratives, 4(1), 309–324. https://doi.org/10.70177/ijen.v4i1.3506