Machine learning model for classifying high school students’ academic performance in mathematics amidst the COVID-19 context
Abstract
Classifying academic performance is uncertain because educational institutions measure it through a large number of variables. This research therefore aimed to develop a machine learning model to classify the academic performance in mathematics of students at a private Regular Educational Institution in the department of Lambayeque, Peru. Following the CRISP-DM methodology, a dataset was built with data collection instruments applied to 711 students in the first, second, and third years of high school between 2019 and 2021, in the context of the COVID-19 pandemic. The dataset, validated by expert judgment, contained 34 input variables and 1 output variable with three possible values: 1 (deficient), 2 (improvable), and 3 (optimal). According to the correlation matrix, the variables most strongly related to the classification of academic performance were the student's self-perception of performance and the previous year's grades in mathematics subjects. The model was trained and then evaluated, reaching 91% accuracy. It was concluded that academic performance can be classified with a machine learning model, and that the K-nearest neighbors model was the most appropriate for this research because it works with categorical data and achieves an adequate level of certainty.
Keywords
Machine learning, classification model, academic performance, mathematics, KNN algorithm
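As an illustration of the approach summarized in the abstract, the following is a minimal sketch of a K-nearest neighbors classifier over categorical features, written in plain Python. It is not the authors' implementation. The Hamming distance, the value k=5, the number of levels per variable, and the synthetic data generator `make_synthetic` are all assumptions made for this example; only the 34 input variables and the three output labels come from the abstract.

```python
from collections import Counter
import random

# Output classes as described in the abstract.
LABELS = {1: "deficient", 2: "improvable", 3: "optimal"}


def hamming(a, b):
    """Count the positions where two categorical feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=5):
    """Return the majority label among the k training rows closest to query.

    The distance metric (Hamming) and k=5 are assumptions for this sketch.
    """
    ranked = sorted(zip(train_X, train_y), key=lambda row: hamming(row[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def make_synthetic(n_rows, n_features=34, n_levels=4, noise=0.2, seed=0):
    """Hypothetical data generator: NOT the study's dataset.

    Each class gets a random prototype row; samples copy the prototype and
    replace each feature with a random level with probability `noise`.
    """
    rng = random.Random(seed)
    prototypes = {
        label: [rng.randrange(n_levels) for _ in range(n_features)]
        for label in LABELS
    }
    X, y = [], []
    for _ in range(n_rows):
        label = rng.choice(list(LABELS))
        row = [
            rng.randrange(n_levels) if rng.random() < noise else value
            for value in prototypes[label]
        ]
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic(200)
    split = 150  # simple hold-out split, chosen arbitrarily for the sketch
    acc = accuracy(X[:split], y[:split], X[split:], y[split:])
    print(f"hold-out accuracy on synthetic data: {acc:.2f}")
```

The accuracy printed by this sketch reflects only the synthetic data and has no bearing on the 91% reported in the study. In practice, a library implementation such as scikit-learn's `KNeighborsClassifier` would usually be preferred over a hand-written loop.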
DOI: https://doi.org/10.3926/jotse.2945
This work is licensed under a Creative Commons Attribution 4.0 International License
Journal of Technology and Science Education, 2011-2025
Online ISSN: 2013-6374; Print ISSN: 2014-5349; DL: B-2000-2012
Publisher: OmniaScience