ENHANCED GRADUATION PREDICTION WITH CATBOOST: A SIX-CLASS CLASSIFICATION SYSTEM FOR ACADEMIC ADVISING

Authors

  • Mai Linh Đoàn Thị, Industrial University of Ho Chi Minh City
  • Nhí Võ Văn
  • Quang Nguyễn

Keywords:

Academic performance prediction, CatBoost, Educational data mining, Explainable AI, Graduation prediction, Multi-class classification, SMOTE

Abstract

This study presents an enhanced graduation prediction system for Information Systems students. It introduces three major improvements over our previous work: (1) a refined six-class classification framework with clearer naming and cause-specific labels that distinguish graduation quality levels from causes of delay; (2) the CatBoost algorithm in place of Random Forest, which achieves 88% accuracy with better stability (standard deviation of ±1.3% vs ±2.1%) and native support for categorical features; and (3) a production-ready system architecture that integrates role-based access control and comprehensive logging of prediction history. Using academic data from 793 students at Industrial University of Ho Chi Minh City (167 more than in our initial study), the CatBoost model performed consistently across all six outcome categories, with per-class F1-scores ≥0.81. SMOTE was used to address class imbalance, and LIME to improve model interpretability. Key predictive features include cumulative GPA, TOEIC certification status, and core subject performance. The deployed system enables early identification of at-risk students with an average response time of 1.1 seconds per prediction, supporting proactive academic advising and institutional decision-making.
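The abstract names SMOTE as the technique used against class imbalance. The following is a minimal standard-library sketch of the core SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours), not the authors' pipeline. The function name `smote_oversample`, the parameter defaults, and the sample rows with their value scales are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between
    minority-class samples and their nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(
            tuple(b + gap * (o - b) for b, o in zip(base, other))
        )
    return synthetic


# Hypothetical minority-class rows: (cumulative GPA, core-subject average).
# The values and scales are made up for this sketch.
minority_rows = [(2.1, 5.0), (2.3, 5.4), (1.9, 4.8), (2.6, 5.9)]
new_rows = smote_oversample(minority_rows, n_new=6)
```

Each synthetic row lies on the line segment between two real minority rows, so it stays inside the range the minority class already covers. The sketch handles numeric features only; a categorical feature such as TOEIC certification status cannot be interpolated this way and would need a categorical-aware variant.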

Published

09-12-2025

Issue

Section

Information Systems