Paper Title : Identification of Classification Anomalies in Students’ Areas of Specialization using Ensemble Classifiers
ISSN : 2394-2231
Year of Publication : 2021
DOI : 10.29126/23942231/IJCT-v8i4p3
MLA Style: Anthony Irungu Njina, Mary Asunta Gaceri "Identification of Classification Anomalies in Students’ Areas of Specialization using Ensemble Classifiers" Volume 8 - Issue 4 July-August, 2021 International Journal of Computer Techniques (IJCT), ISSN: 2394-2231, www.ijctjournal.org
APA Style: Anthony Irungu Njina, Mary Asunta Gaceri "Identification of Classification Anomalies in Students’ Areas of Specialization using Ensemble Classifiers" Volume 8 - Issue 4 July-August, 2021 International Journal of Computer Techniques (IJCT), ISSN: 2394-2231, www.ijctjournal.org
Abstract
In this paper, we propose an ensemble of different classifiers and examine the distribution of anomalies in the classification reports of the individual models. The ensemble is constructed from three base classifiers: Multinomial Naïve Bayes (MNB), Support Vector Machines (SVM), and Random Forest (RF). We expect improved accuracy from the combined predictive power of the different algorithms, and hypothesize that a concurrence of high error rates within a class indicates a classification anomaly. Results showed improved accuracy for classes whose individual F1 scores were within the range of the average F1 scores. However, we could not make similar observations for classes with low support or for the classes identified as containing possible instances of misclassification. The results of this experiment suggest that an ensemble model with data preprocessing is a more accurate model for predicting students’ subject area combinations.
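The hypothesis in the abstract, that a class on which the base classifiers all show high error rates may be anomalous, can be illustrated with a minimal sketch. It is not the authors' implementation: the labels, the three prediction lists, the hard-voting rule, and the 0.6 F1 threshold are all hypothetical values chosen for illustration, and the base classifiers are represented only by their outputs.

```python
from collections import Counter

def majority_vote(predictions):
    """Hard voting: for each sample, return the label most base models chose.
    Ties go to the label encountered first (the earliest-listed model)."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 when the denominator is 0."""
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def flag_anomalous_classes(y_true, model_preds, threshold):
    """Flag classes where EVERY base model's F1 falls below `threshold`,
    i.e. where high error rates concur across models."""
    reports = [f1_per_class(y_true, preds) for preds in model_preds.values()]
    return [label for label in sorted(set(y_true))
            if all(report[label] < threshold for report in reports)]

# Hypothetical toy data: made-up subject-area labels and made-up predictions.
y_true = ["sci", "sci", "sci", "art", "art", "art", "biz", "biz"]
model_preds = {
    "MNB": ["sci", "sci", "sci", "art", "art", "biz", "art", "sci"],
    "SVM": ["sci", "sci", "art", "art", "art", "art", "sci", "art"],
    "RF":  ["sci", "sci", "sci", "art", "biz", "art", "art", "biz"],
}

ensemble = majority_vote(list(model_preds.values()))
flagged = flag_anomalous_classes(y_true, model_preds, threshold=0.6)  # assumed cutoff
print("ensemble predictions:", ensemble)
print("classes flagged as possible anomalies:", flagged)
```

In this toy data the low-support class is the one every base model handles poorly, so it is the one flagged. In practice the threshold would have to be chosen relative to the average F1 scores rather than fixed in advance.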
Keywords
Misclassification, Anomaly Detection, Ensemble Learning, Students Tracking