Enhancing Student Outcomes: Machine Learning Stacked Classifiers in Higher Education

Basta A.
2024-01-01

Abstract

University dropout rates in Europe pose a significant challenge to the European Union’s target of 45% higher education attainment among 25–34-year-olds by 2030. This pilot study introduces the Integrated Theory-Driven ML Dropout (ITMD) Model, a framework that combines Machine Learning (ML) techniques with established educational theories, including Tinto’s Dropout Theory, Expectancy-Value Theory, Self-Determination Theory, and the Hardré and Reeve Model. The UnitelmaSapienza (2021) dataset was used in this initial study to develop and implement a stacked classifier architecture. Random Forest, Support Vector Machine (SVM), and Gradient Boosting serve as base learners, with eXtreme Gradient Boosting (XGBoost) as the meta-learner. The model achieves 93% accuracy, precision and recall of 91–95%, and an AUC-ROC of 0.94, indicating balanced dropout prediction. The ensemble-based approach of the ITMD Model improves significantly on traditional methods, particularly in the early identification of at-risk students, and offers educators and policymakers a tool for targeted interventions. By linking ML advances with theoretical foundations, the research presents a novel approach to dropout challenges globally, and in European higher education in particular, in support of both institutional and societal goals.
Year: 2024
ISBN: 9789464668537

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11389/87415