Abstract:
Two kinds of feature influence degree were defined. The first is the influence degree based on the dispersion of a feature's documents among categories, for which a larger value is better. The second is the influence degree based on the dispersion of a feature's documents within a category, for which a smaller value is better. The two kinds of influence degree were then integrated into a new feature selection method. The method evaluates each candidate feature against both criteria, so that a more representative feature set is obtained. Simulation experiments show that, to a certain extent, the feature selection method improves the performance of text categorization.
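
To make the idea of combining the two dispersion measures concrete, the following Python sketch scores a single feature. It is only an illustration under stated assumptions, not the method's actual definitions, which the abstract does not give. The function names `inter_dispersion`, `intra_dispersion`, and `score`, the use of variance as the dispersion measure, and the combination rule `inter / (1 + intra)` are all hypothetical choices made for this example.

```python
from statistics import pvariance

# Input for one feature: category name -> list of per-document counts of the feature.
# All formulas below are illustrative assumptions, not the paper's definitions.

def inter_dispersion(counts_by_cat):
    """Assumed among-category measure: variance, across categories, of the
    fraction of documents in each category that contain the feature.
    Larger means the feature separates categories better."""
    rates = [
        sum(1 for c in counts if c > 0) / len(counts)
        for counts in counts_by_cat.values()
    ]
    return pvariance(rates)

def intra_dispersion(counts_by_cat):
    """Assumed within-category measure: variance of the feature's counts over
    the documents of each category, averaged over categories.
    Smaller means the feature is spread evenly inside each category."""
    variances = [pvariance(counts) for counts in counts_by_cat.values()]
    return sum(variances) / len(variances)

def score(counts_by_cat):
    """Assumed combination: reward among-category dispersion and
    penalize within-category dispersion."""
    return inter_dispersion(counts_by_cat) / (1.0 + intra_dispersion(counts_by_cat))

# A feature concentrated evenly in one category versus one scattered everywhere.
focused = {"sports": [2, 2, 2], "finance": [0, 0, 0]}
scattered = {"sports": [1, 0, 1], "finance": [1, 1, 0]}
ranked = sorted(
    [("focused", score(focused)), ("scattered", score(scattered))],
    key=lambda item: item[1],
    reverse=True,
)
```

Under these assumptions, a feature that appears uniformly in the documents of one category and not in the other ranks above a feature scattered across both, which matches the two stated preferences: larger dispersion among categories and smaller dispersion within a category.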