Citation: ZHANG H X, LI P P, HU X G. High-dimensional multi-label learning based on missing features[J]. Microelectronics & Computer, 2023, 40(2): 59-70. doi: 10.19304/J.ISSN1000-7180.2022.0266

High-dimensional multi-label learning based on missing features

  • Abstract: Multi-label learning deals with problems in which each sample is associated with multiple class labels, but in practical applications it is rarely possible to obtain complete feature information all at once. Existing multi-label learning methods address missing features but do not consider missing features in high-dimensional settings. In addition, most existing feature dimensionality reduction methods are either adapted directly from single-label feature selection methods or fail to make full use of label information, so they cannot obtain the optimal feature selection result shared by multiple labels. Motivated by this, we propose a multi-label learning method for missing features in high-dimensional settings. First, a new completed feature matrix is obtained by learning a feature correlation matrix; it carries more complete feature information than the original feature matrix with missing entries. Second, an information-theoretic approach is introduced to formulate a general global optimization framework that accounts for feature correlation, label correlation, and feature redundancy, achieving feature dimensionality reduction for high-dimensional multi-label data. Then, to improve multi-label classification performance, feature correlation is constrained on the coefficient matrix under the assumption that if two features are strongly correlated, the similarity between their corresponding parameter vectors is larger. In addition, label correlation is constrained on the label outputs to capture richer relationships between different labels. Extensive experiments show that the proposed method is competitive with other state-of-the-art multi-label learning methods.
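The abstract does not state how the feature correlation matrix is learned, so the following is only a minimal sketch of the general idea of completing a feature matrix through learned feature-to-feature relations, not the authors' algorithm. It assumes that each feature is ridge-regressed on all the others (a zero-diagonal closed form) and that missing entries are refilled iteratively; the function names and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def feature_correlation_matrix(X, lam=1.0):
    """Ridge-regress every feature on all the other features.

    Returns a (d, d) matrix B with zero diagonal such that X @ B
    approximates X. This closed form is an assumption made for
    illustration, not the formulation used in the paper.
    """
    d = X.shape[1]
    P = np.linalg.inv(X.T @ X + lam * np.eye(d))
    B = -P / np.diag(P)        # column j is scaled by 1 / P[j, j]
    np.fill_diagonal(B, 0.0)   # a feature may not predict itself
    return B

def complete_features(X, observed, lam=1.0, n_iter=20):
    """Fill missing entries of X (where observed is False).

    Observed entries are always kept; missing entries start at zero
    and are repeatedly replaced by their prediction from the other
    features.
    """
    X_filled = np.where(observed, X, 0.0)
    for _ in range(n_iter):
        B = feature_correlation_matrix(X_filled, lam)
        X_hat = X_filled @ B
        X_filled = np.where(observed, X, X_hat)
    return X_filled

# Synthetic example: low-rank data with about 20% of entries missing.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 10))
observed = rng.random(X_true.shape) > 0.2
X_missing = np.where(observed, X_true, np.nan)
X_completed = complete_features(X_missing, observed)
```

In this sketch the completed matrix keeps every observed value and only estimates the missing ones, which matches the abstract's description of a completed matrix that carries more feature information than the original one. The feature selection and the correlation-constrained classifier described in the abstract are not covered here.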

     
