On incremental wrapper-based attribute selection: experimental analysis of the relevance criteria

Pablo Bermejo, Jose A. Gámez, Jose M. Puerta

This paper deals with the problem of feature subset selection in classification-oriented datasets with a (very) large number of attributes. In such datasets, classical wrapper approaches become intractable because of the large number of wrapper evaluations they require. One way to alleviate this problem is the so-called filter-wrapper approach: a filter measure is first used to rank the predictive attributes, and a wrapper is then applied by following that ranking, so that each attribute is considered once, in rank order. In this way the number of wrapper evaluations is linear in the number of predictive attributes. The main contribution of this paper is the analysis of different relevance criteria used to decide whether a new feature should be included in or rejected from the selected subset. Experiments have been carried out with three different criteria at different strictness levels, and a statistical analysis is used to draw conclusions about the best configurations.
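The filter-wrapper scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `incremental_wrapper_selection`, the `evaluate` callback (standing in for a wrapper evaluation such as cross-validated accuracy), the `min_gain` threshold (standing in for a strictness level), and the toy scores are all hypothetical. The acceptance rule used here, "keep the feature if the score improves by more than `min_gain`", is only a placeholder for the relevance criteria the paper compares.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_wrapper_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float, int]:
    """Walk a filter ranking once, keeping features that pass the criterion.

    ranked_features: attributes already sorted by a filter measure (best first).
    evaluate: hypothetical wrapper evaluation of a candidate subset.
    min_gain: assumed strictness level; a feature is kept only if it
              improves the current best score by more than this amount.
    """
    selected: List[str] = []
    best_score = evaluate(selected)  # baseline score of the empty subset
    evaluations = 0  # wrapper evaluations spent on candidate features

    for feature in ranked_features:
        candidate = selected + [feature]
        score = evaluate(candidate)
        evaluations += 1
        # Placeholder relevance criterion (an assumption, not the paper's).
        if score - best_score > min_gain:
            selected = candidate
            best_score = score

    return selected, best_score, evaluations


# Toy wrapper: accuracy in percent, where only "a" and "c" carry information.
USEFUL = {"a": 30, "c": 20}


def toy_evaluate(subset: List[str]) -> int:
    return 50 + sum(USEFUL.get(f, 0) for f in subset)


ranking = ["a", "b", "c", "d"]  # assumed output of some filter measure
selected, score, n_evals = incremental_wrapper_selection(ranking, toy_evaluate)
strict_selected, strict_score, _ = incremental_wrapper_selection(
    ranking, toy_evaluate, min_gain=25
)
```

Each ranked attribute triggers exactly one wrapper evaluation, which is where the linear cost comes from. Raising `min_gain` makes the criterion stricter and tends to yield smaller subsets, as the second call in the toy example shows.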
