Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/61346
Title: Contextualizing the current state of research on the use of machine learning for student performance prediction: A systematic literature review
Contributor(s): Alalawi, Khalid (author); Athauda, Rukshan (author); Chiong, Raymond (author)
Publication Date: 2023-12
Open Access: Yes
DOI: 10.1002/eng2.12699
Handle Link: https://hdl.handle.net/1959.11/61346
Abstract: 

Today, educational institutions produce large amounts of data with the deployment of learning management systems. These large datasets provide an untapped potential to support and enhance decision-making and operations. In recent times, machine learning (ML) has been applied to develop models utilizing this “big” data to assist in decision-making. This study presents a systematic literature review into the application of ML to predict student performance. A total of 162 research articles from January 2010 to October 2022 were critically reviewed and analyzed by applying Kitchenham’s systematic literature review approach. Our analysis categorized the literature predicting students’ academic performance into two categories: (i) predicting student performance in assessments, courses or programs, and identifying students at risk of failing their course/program (129 studies); and (ii) predicting student dropout or retention in a course or program (33 studies). Classification is the most commonly used approach for predicting student performance (138 studies), followed by regression (25 studies) and clustering (9 studies). Supervised learning methods are used more often than semi-supervised learning. The five most popular ML algorithms are the Decision Tree, Random Forest, Naïve Bayes, Artificial Neural Network, and Support Vector Machine. Historical records of students’ grades and class performance, academic-related data from learning management systems, and students’ demographics are the most common features used for predicting students’ performance. The most common methods used for feature selection are Information Gain-based selection algorithms, Correlation-based feature selection, and Gain Ratio. The general platforms/tools/libraries used in the studies include WEKA, Python, R, RapidMiner, and MATLAB. We also investigated possible actions considered in the literature to help at-risk students. We found very few studies that deployed remedial actions and evaluated their impact on students’ performance. In conclusion, ML has shown great potential in the prediction of student performance, but many areas remain open for further research.

Publication Type: Journal Article
Source of Publication: Engineering Reports, 5(12), p. 1-25
Publisher: John Wiley & Sons, Inc
Place of Publication: United States of America
ISSN: 2577-8196
Fields of Research (FoR) 2020: 4602 Artificial intelligence
Peer Reviewed: Yes
HERDC Category Description: C1 Refereed Article in a Scholarly Journal
Appears in Collections: Journal Article; School of Science and Technology

Files in This Item:
2 files
File: openpublished/ContextualizingChiong2023JournalArticle.pdf
Description: Published Version
Size: 2.36 MB
Format: Adobe PDF

Scopus Citations: 6 (checked on Oct 26, 2024)

This item is licensed under a Creative Commons License.