Classification of the placement success in the undergraduate placement examination according to decision trees with bagging and boosting methods
Abstract
The purpose of this study is to classify, using the Bagging and Boosting ensemble algorithms, a data set of students from Turkey's 81 provinces who were placed in universities through the Undergraduate Placement Examination between 2010 and 2013. The data were taken from the archives of Turk-Stat (Turkish Statistical Institute) and OSYM (Assessment, Selection and Placement Center), and the analysis was carried out in MATLAB. To make the classification performance of Bagging and Boosting easier to evaluate, the students' success rates were divided into two groups: provinces above the average were coded as 1 and provinces below the average were coded as 0, and this coding formed the dependent variable. The Bagging and Boosting ensemble algorithms were then run on the coded data. To evaluate the predictive ability of the two algorithms, the data set was divided into training and testing sets: the data for 2010-2012 were used for training and the data for 2013 were used for testing. Accuracy, precision, recall and F-measure were used to assess the performance of the methods, and the results of the "Bagging" and "Boosting" methods were compared. On all performance measures, the "Boosting" method produced marginally better results than the "Bagging" method.
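The data preparation and evaluation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function names, the `year` field, and the toy values are hypothetical, and the abstract does not say whether the average was taken per year or over all years. The sketch covers only the 0/1 coding, the split by year, and the four performance measures; it does not implement Bagging or Boosting.

```python
from statistics import mean


def code_by_average(rates):
    """Code each province 1 if its success rate is above the mean, else 0.

    Assumption: one average over all provinces passed in; the abstract
    does not specify how the average was computed.
    """
    avg = mean(rates.values())
    return {province: int(rate > avg) for province, rate in rates.items()}


def split_by_year(rows, test_year=2013):
    """Use earlier years for training and `test_year` for testing."""
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    print(code_by_average({"A": 30, "B": 40, "C": 80}))
    print(binary_metrics([1, 1, 1, 0], [1, 1, 0, 0]))
```

Computing `binary_metrics` on the 2013 predictions of each classifier gives the four figures on which the two methods are compared.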
Details
Primary Language
English
Journal Section
Research Article
Authors
Hayrettin Okut
0000-0003-4084-8404
United States
Publication Date
March 22, 2020
Submission Date
March 26, 2019
Acceptance Date
January 21, 2020
Published in Issue
Year 2020 Volume: 41 Number: 1
Cited By
Performance Analysis of Combination of CNN-based Models with Adaboost Algorithm to Diagnose Covid-19 Disease
Journal of Polytechnic
https://doi.org/10.2339/politeknik.901375
A comparative study of ensemble methods in the field of education: Bagging and Boosting algorithms
International Journal of Assessment Tools in Education
https://doi.org/10.21449/ijate.1167705