PROCEEDINGS IPMU '08
Removing Redundancy from Relevant Features in Text Classification
E. Montañes, I. Díaz, E. F. Combarro, J. Ranilla.
This paper proposes a method for
Feature Selection in Text Categorization.
This task is performed in
two steps. First, an analysis of relevance
is performed, and after that an
analysis of redundancy is carried out. For
this purpose, a range of similarity
measures are adopted and converted
into symmetrical ones using several
aggregation operators. This guarantees
that the similarity between two
words is independent of the order
in which they are considered. Several experiments
over four corpora are performed,
leading to the conclusion that this
method achieves good results.
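
The symmetrization step can be illustrated with a minimal sketch. The abstract does not name the similarity measures or aggregation operators used, so everything below is an assumption made for illustration: `conditional_similarity` is a hypothetical asymmetric measure based on document co-occurrence, and min, max, and arithmetic mean stand in for the aggregation operators. Any commutative operator applied to the two directed scores yields a measure that does not depend on the order of the words.

```python
from typing import Callable, Dict, Set

Similarity = Callable[[Set[int], Set[int]], float]
Aggregator = Callable[[float, float], float]


def conditional_similarity(docs_a: Set[int], docs_b: Set[int]) -> float:
    """Hypothetical asymmetric measure: the fraction of documents
    containing word a that also contain word b."""
    if not docs_a:
        return 0.0
    return len(docs_a & docs_b) / len(docs_a)


# Assumed aggregation operators; all three are commutative.
AGGREGATORS: Dict[str, Aggregator] = {
    "min": min,
    "max": max,
    "mean": lambda x, y: (x + y) / 2.0,
}


def symmetrize(sim: Similarity, aggregate: Aggregator) -> Similarity:
    """Combine sim(a, b) and sim(b, a) into one order-independent score."""
    def symmetric(docs_a: Set[int], docs_b: Set[int]) -> float:
        return aggregate(sim(docs_a, docs_b), sim(docs_b, docs_a))
    return symmetric


# Toy data: the set of document ids in which each word occurs.
docs_with_word = {"ball": {1, 2, 3, 4}, "goal": {3, 4}}
ball, goal = docs_with_word["ball"], docs_with_word["goal"]

for name, aggregate in AGGREGATORS.items():
    sym = symmetrize(conditional_similarity, aggregate)
    # Both argument orders give the same value.
    print(name, sym(ball, goal), sym(goal, ball))
```

In this toy data the directed scores differ (0.5 from "ball" to "goal", 1.0 in the other direction), while each symmetrized score is the same for both argument orders. Which operator suits redundancy analysis best is an empirical question that the sketch does not address.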