Journal of Statistics Applications & Probability

Author Country (or Countries)

South Africa


Early childhood numeracy skills development is often studied by predicting the learner’s performance in a numeracy test. It is an important area of study because numeracy affects the learner’s mathematical and statistical abilities later in life. Although classification algorithms each have strengths and weaknesses relative to one another, they are often applied to the prediction of early childhood numeracy skills development without any justification for choosing one algorithm over the others. In this paper, the bi-directional stepwise logistic regression model (SLRM), the hierarchical logistic regression model (HLRM), the classification and regression tree (CART) and Naïve Bayes (NB) were compared in terms of their ability to predict learners’ numeracy test performance. The algorithms were compared using the true positive rate (sensitivity), the true negative rate (specificity), the classification error, the classification accuracy and the area under the receiver operating characteristic curve (AUROC). The results showed that the HLRM, which has been applied in several previous studies on the prediction of numeracy test competence, was the best classifier, followed by SLRM, CART and then NB. The study also confirmed some important predictors of the learner’s performance in a numeracy test, some of which had been identified in previous studies on early childhood numeracy development. Gaps and recommendations for future research on the classification algorithms, as well as implications for practice, were also highlighted. The HLRM scoring algorithm generated from SPSS is available as supplementary material and can be used to classify a set of new learners into either the pass or the fail group.
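For readers less familiar with the comparison criteria named above, the following is a minimal Python sketch of how they can be computed for a binary pass/fail classifier. It is an illustration only and not the authors' SPSS procedure: the function names, the coding of pass as 1 and fail as 0, the 0.5 cut-off and the small example data are all assumptions made here, and the AUROC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
# Illustrative sketch only: names, labels (1 = pass, 0 = fail) and data are assumptions.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 1 (pass) and 0 (fail)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity (true positive rate), specificity (true negative rate),
    classification accuracy and classification error."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "error": 1 - accuracy,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen pass case receives a
    higher score than a randomly chosen fail case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical observed outcomes and predicted pass probabilities for 8 learners.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = classification_metrics(y_true, y_pred)
area = auroc(y_true, scores)
```

Under this coding, the true positive rate and sensitivity are the same quantity, as are the true negative rate and specificity, and the classification error is one minus the classification accuracy. The AUROC is the only criterion of the five that does not depend on the chosen cut-off.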

Digital Object Identifier (DOI)