Daniel Silva-Palacios, Cèsar Ferri, and María José Ramírez-Quintana

Improving Performance of Multiclass Classification by Inducing Class Hierarchies


In recent decades, how to obtain better classifications has received considerable attention. The problem becomes harder as the number of classes grows. In this multiclass scenario, class labels are usually assumed to be independent of each other, and most techniques proposed to improve classifier performance rely on this assumption. An alternative way to address the multiclass problem is to distribute the classes hierarchically into a collection of multiclass subproblems, reducing the number of classes involved in each local subproblem. In this paper, we propose a new method for inducing a class hierarchy from the confusion matrix of a multiclass classifier. We then use the class hierarchy to learn a tree-like hierarchy of classifiers that solves the original multiclass problem, much as the top-down hierarchical classification approach does in hierarchical domains. We experimentally evaluate the proposal on a collection of multiclass datasets, showing that, in general, the generated hierarchies outperform not only the original (flat) classification but also hierarchical approaches based on alternative ways of constructing the class hierarchy.
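
To make the idea of inducing a class hierarchy from a confusion matrix concrete, the following is a minimal sketch rather than the method proposed in the paper. It assumes, purely for illustration, that two classes are "close" when a flat classifier often confuses them, that closeness is measured by the average of the two row-normalised confusion rates, and that classes are grouped by average-linkage agglomerative clustering. The function names `confusion_to_distance` and `induce_hierarchy` and the toy matrix are hypothetical.

```python
from itertools import combinations


def confusion_to_distance(cm):
    """Convert a confusion matrix (rows = true class) into a symmetric distance matrix.

    Assumption: distance = 1 - mean of the two row-normalised confusion rates.
    """
    n = len(cm)
    # Row-normalise: rates[i][j] is the fraction of class i predicted as class j.
    rates = [[v / (sum(row) or 1) for v in row] for row in cm]
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        overlap = (rates[i][j] + rates[j][i]) / 2
        dist[i][j] = dist[j][i] = 1.0 - overlap
    return dist


def average_distance(dist, a, b):
    """Average-linkage distance between two clusters given as (tree, members)."""
    total = sum(dist[i][j] for i in a[1] for j in b[1])
    return total / (len(a[1]) * len(b[1]))


def induce_hierarchy(cm, labels):
    """Greedily merge the two closest clusters until one binary tree remains."""
    dist = confusion_to_distance(cm)
    # Each cluster is a pair: (nested tuple of labels, list of class indices).
    clusters = [(label, [i]) for i, label in enumerate(labels)]
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(dist, pair[0], pair[1]))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Toy confusion matrix: cat/dog and car/truck are confused with each other.
labels = ["cat", "dog", "car", "truck"]
cm = [
    [40, 9, 1, 0],
    [8, 41, 0, 1],
    [0, 1, 38, 11],
    [1, 0, 12, 37],
]
tree = induce_hierarchy(cm, labels)
```

In this toy example the resulting tree groups `car` with `truck` and `cat` with `dog` before joining the two groups at the root. A top-down hierarchy of classifiers would then train one classifier per internal node of such a tree, each deciding only among that node's children.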