A review of feature selection methods
-
Abstract: In the era of big data, feature selection is a necessary step in data preprocessing. As a data dimensionality reduction technique, its main purpose is to select from the original data the relevant features that are most useful to the learning algorithm, thereby reducing the dimensionality of the data and the difficulty of the learning task and improving the efficiency of the model. Research on feature selection algorithms has achieved substantial results, but it still faces major challenges, among which the curse of dimensionality is the most prominent one for feature selection and classification. This paper first introduces the basic architecture of a feature selection algorithm, describing in turn its four stages: subset generation, subset evaluation, the stopping criterion, and result validation. It then reviews the research methods and achievements in the field of feature selection, classifying the methods by evaluation strategy, search strategy, and supervision information, comparing these traditional methods, and pointing out their strengths and weaknesses. Finally, it summarizes feature selection and discusses promising directions for future research.
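The four-stage architecture named in the abstract can be summarized in a short sketch. The following Python/NumPy code is only an illustration under stated assumptions: `evaluate`, `max_features`, and `min_gain` are hypothetical placeholders (for example, `evaluate` could be the cross-validated accuracy of a classifier trained on the candidate subset); it is not the method of any particular paper surveyed here.

```python
import numpy as np

def feature_selection(X, y, evaluate, max_features=10, min_gain=1e-3):
    """Generic four-stage feature selection loop (illustrative sketch).

    X        : (n_samples, n_features) data matrix
    y        : target labels
    evaluate : hypothetical callback that scores a candidate subset,
               e.g. cross-validated accuracy on X[:, subset]
    """
    n_features = X.shape[1]
    selected, best_score = [], -np.inf

    while len(selected) < max_features:                 # stage 3: stopping criterion
        # stage 1: subset generation -- extend the current subset by one feature
        candidates = [selected + [j] for j in range(n_features) if j not in selected]
        if not candidates:
            break
        # stage 2: subset evaluation -- score every candidate with the criterion
        scores = [evaluate(X[:, subset], y) for subset in candidates]
        if max(scores) - best_score < min_gain:         # stage 3: stopping criterion
            break
        best_score = max(scores)
        selected = candidates[int(np.argmax(scores))]

    # stage 4: result validation happens outside this loop, e.g. by retraining
    # the final model on the selected features and checking held-out data.
    return selected, best_score
```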
-
Key words:
- machine learning
- feature selection
- evaluation strategy
- search strategy
- supervision information
-
Table 1. Comparison of methods based on evaluation strategies

| Feature selection method | Filter methods | Wrapper methods | Embedded methods |
| --- | --- | --- | --- |
| Efficiency | Relatively high | Relatively high with sequential search; lower with random search | Relatively high |
| Applicable scenarios | Suitable for large-scale datasets | Not suitable for high-dimensional datasets | Can handle high-dimensional datasets |
| Advantages | Independent of any specific classifier, so the algorithm is highly general and has low complexity; the selected subset has low redundancy | The selected subset usually achieves better classification performance; relatively efficient | The selected subset performs well; relatively high efficiency |
| Disadvantages | The evaluation criterion is independent of the learning algorithm, so the selected subset is usually less accurate in classification than with wrapper methods | Weak generality; very high computational complexity | Depends on the specific learning algorithm and may overfit |
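To make the three evaluation strategies in Table 1 concrete, the sketch below shows one common representative of each, assuming scikit-learn is available; the synthetic data and all parameter values (`k=10`, `C=0.1`, etc.) are arbitrary choices for illustration, not values taken from the surveyed work.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import (SelectKBest, mutual_info_classif,
                                       SequentialFeatureSelector, SelectFromModel)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           random_state=0)

# Filter: rank features by mutual information with the label, independently
# of any classifier, and keep the top 10.
X_filter = SelectKBest(score_func=mutual_info_classif, k=10).fit_transform(X, y)

# Wrapper: sequential forward search that scores each candidate subset by
# cross-validating a k-NN classifier -- usually accurate but expensive.
sfs = SequentialFeatureSelector(KNeighborsClassifier(), n_features_to_select=10,
                                direction='forward', cv=3)
X_wrapper = sfs.fit_transform(X, y)

# Embedded: L1-regularized logistic regression drives irrelevant coefficients
# to zero while the model is trained, so selection falls out of the fit.
l1_model = LogisticRegression(penalty='l1', solver='liblinear', C=0.1)
X_embedded = SelectFromModel(l1_model).fit_transform(X, y)

print(X_filter.shape, X_wrapper.shape, X_embedded.shape)
```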
Table 2. Comparison of three search strategy methods
| Search strategy | Global search | Random search | Sequential search |
| --- | --- | --- | --- |
| Efficiency | Low | Relatively high | High |
| Applicable scenarios | Only low-dimensional feature sets | Generally applicable | Generally applicable |
| Advantages | Can reach the global optimum | Lower time complexity than global search; can find near-optimal solutions better than sequential search | Lowest time complexity |
| Disadvantages | Only practical on low-dimensional datasets | Higher time complexity than sequential search | The obtained subset is only locally optimal |
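The trade-off in Table 2 can be seen in a compact sketch of the three search strategies, written here in plain Python under the assumption that `score(subset)` is some subset-evaluation criterion supplied by the caller (a hypothetical callback, not a specific method from the literature).

```python
import random
from itertools import combinations

def global_search(n_features, k, score):
    """Exhaustive search: guaranteed optimal, but C(n, k) evaluations --
    only viable for low-dimensional feature sets."""
    return max(combinations(range(n_features), k), key=score)

def random_search(n_features, k, score, n_iter=200, seed=0):
    """Random search: samples subsets; often near-optimal at a cost that
    does not explode with dimensionality."""
    rng = random.Random(seed)
    candidates = [tuple(sorted(rng.sample(range(n_features), k)))
                  for _ in range(n_iter)]
    return max(candidates, key=score)

def sequential_search(n_features, k, score):
    """Sequential forward search: greedily adds one feature at a time;
    cheapest, but the result may only be locally optimal."""
    selected = []
    while len(selected) < k:
        best = max((j for j in range(n_features) if j not in selected),
                   key=lambda j: score(tuple(selected + [j])))
        selected.append(best)
    return tuple(selected)
```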
Table 3. Comparison of feature selection methods based on different supervision information
| Feature selection method | Supervised | Semi-supervised | Unsupervised |
| --- | --- | --- | --- |
| Advantages | Does not depend on the distribution of the sample space | Needs only a small number of labelled samples, which effectively reduces the labelling cost | Uses no label information at all, so the computational cost drops sharply |
| Disadvantages | Requires a large number of labelled samples; the features of a new application cannot be predicted in advance; high computational cost; cannot handle unknown data | Performs poorly on noisy data | High ambiguity of the training samples |
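The distinction drawn in Table 3 amounts to how much label information the scoring criterion consumes. The sketch below (assuming scikit-learn and NumPy; synthetic data and arbitrary thresholds, used only for illustration) scores features with a supervised criterion that needs every label and with an unsupervised criterion that ignores labels; the semi-supervised variant shown is a crude illustration that applies the supervised score to the small labelled slice only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, VarianceThreshold

X, y = make_classification(n_samples=300, n_features=30, n_informative=5,
                           random_state=0)

# Supervised: mutual information between each feature and the label --
# requires a fully labelled sample set.
mi_scores = mutual_info_classif(X, y, random_state=0)
supervised_top = np.argsort(mi_scores)[::-1][:5]

# Unsupervised: keep features whose variance exceeds a threshold -- uses no
# label information, so it is cheap but blind to class structure.
vt = VarianceThreshold(threshold=0.5).fit(X)
unsupervised_keep = np.flatnonzero(vt.get_support())

# Semi-supervised (illustrative only): when just the first 30 samples carry
# labels, the supervised score is computed on that small labelled slice,
# which is the labelling-cost saving the table refers to.
labelled = slice(0, 30)
semi_scores = mutual_info_classif(X[labelled], y[labelled], random_state=0)
semi_top = np.argsort(semi_scores)[::-1][:5]

print(supervised_top, unsupervised_keep[:5], semi_top)
```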