Title: | Contextualizing the current state of research on the use of machine learning for student performance prediction: A systematic literature review |
Contributor(s): | Alalawi, Khalid (author); Athauda, Rukshan (author); Chiong, Raymond (author) |
Publication Date: | 2023-12 |
Open Access: | Yes |
DOI: | 10.1002/eng2.12699 |
Handle Link: | https://hdl.handle.net/1959.11/61346 |
Abstract: | Today, educational institutions produce large amounts of data with the deployment of learning management systems. These large datasets provide an untapped potential to support and enhance decision-making and operations. In recent times, machine learning (ML) has been applied to develop models utilizing this “big” data to assist in decision-making. This study presents a systematic literature review of the application of ML to predict student performance. A total of 162 research articles published from January 2010 to October 2022 were critically reviewed and analyzed by applying Kitchenham’s systematic literature review approach. Our analysis grouped the literature predicting students’ academic performance into two categories: (i) predicting student performance in assessments, courses or programs, and identifying students at risk of failing their course/program (129 studies); and (ii) predicting student dropout or retention in a course or program (33 studies). Classification is the most commonly used approach for predicting student performance (138 studies), followed by regression (25 studies) and clustering (9 studies). Supervised learning methods are used more often than semi-supervised learning. The five most popular ML algorithms are the Decision Tree, Random Forest, Naïve Bayes, Artificial Neural Network, and Support Vector Machine. Historical records of students’ grades and class performance, academic-related data from learning management systems, and students’ demographics are the most common features used for predicting students’ performance. The most common methods used for feature selection are Information Gain-based selection algorithms, Correlation-based feature selection, and Gain Ratio. The general platforms/tools/libraries used in the studies include WEKA, Python, R, Rapid Miner, and MATLAB. We also investigated possible actions considered in the literature to help at-risk students. We found very few studies that deployed remedial actions and evaluated their impact on students’ performance. In conclusion, ML has shown great potential in the prediction of student performance, but many areas remain open for further research. |
Publication Type: | Journal Article |
Source of Publication: | Engineering Reports, 5(12), p. 1-25 |
Publisher: | John Wiley & Sons, Inc |
Place of Publication: | United States of America |
ISSN: | 2577-8196 |
Fields of Research (FoR) 2020: | 4602 Artificial intelligence |
Peer Reviewed: | Yes |
HERDC Category Description: | C1 Refereed Article in a Scholarly Journal |
Appears in Collections: | Journal Article; School of Science and Technology |