• Peking University core journal (A Guide to the Core Journals of China, 2017 edition)
  • China Science and Technology core journal (statistical source journal for Chinese sci-tech papers)
  • Indexed in the JST (Japan Science and Technology Agency, Japan) database


Restore local descriptors network for few-shot learning

WANG Ronggui, WANG Wei, YANG Juan, XUE Lixia

Citation: WANG Ronggui, WANG Wei, YANG Juan, XUE Lixia. Restore local descriptors network for few-shot learning[J]. Microelectronics & Computer, 2022, 39(8): 21-30. doi: 10.19304/J.ISSN1000-7180.2022.0107

doi: 10.19304/J.ISSN1000-7180.2022.0107
Funding:

National Key Research and Development Program of the Ministry of Science and Technology, U20B2044

National Natural Science Foundation of China, 62106064

Article details
    About the authors:

    WANG Ronggui, male, born in 1966, Ph.D., professor. His research interests include deep learning and intelligent video processing.

    YANG Juan, female, born in 1983, Ph.D., lecturer. Her research interests include intelligent video and image processing and analysis, and neural networks and deep learning.

    XUE Lixia, female, born in 1976, Ph.D., associate professor. Her research interests include neural networks and deep learning, intelligent video and image processing and analysis, and embedded multimedia technology.

    Corresponding author:

    WANG Wei (corresponding author), male, born in 1996, M.S. candidate. His research interests include artificial-intelligence image processing and few-shot image classification. E-mail: wangwei1352@foxmail.com

  • CLC number: TP391.4

Restore local descriptors network for few-shot learning

  • Abstract:

    Existing local-descriptor-based few-shot metric learning methods neither model the correlations among local descriptors nor make full use of class-level global feature information. To address these problems, a Restore Local Descriptors Network (RLDN) is proposed. An adjacent GCN module exploits the spatial relationships between positions within the same image to strengthen the connections among local descriptors, restoring some of the local descriptors corrupted by background noise. A global feature extraction module learns and fuses the global features of the images of a class to output a class-level global descriptor, which is then concatenated with the local descriptors to restore them further. In addition, a triplet loss is introduced and combined with the conventional cross-entropy loss into a new hybrid loss function, which enlarges the margins between different classes and helps the classifier make fewer misclassifications. Experimental results show that, compared with conventional local-descriptor methods, the restore local descriptors network reduces the interference of noisy features on the classifier and effectively improves classification accuracy.
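    To make the hybrid loss concrete, here is a minimal PyTorch sketch that combines cross-entropy with a triplet loss [11], as the abstract describes. The weight `lam`, the margin value, and all tensor shapes are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, labels, anchor, positive, negative,
                margin=1.0, lam=0.5):
    """Cross-entropy on the class logits plus a triplet term that pulls the
    anchor embedding towards the positive and away from the negative.
    The margin and the weight lam are illustrative assumptions."""
    ce = F.cross_entropy(logits, labels)
    triplet = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return ce + lam * triplet

# Toy usage with random tensors: 8 queries, 5 classes, 64-d embeddings.
logits = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
a, p, n = (torch.randn(8, 64) for _ in range(3))
print(hybrid_loss(logits, labels, a, p, n).item())
```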

     

  • Figure 1.  Architecture of the Restore Local Descriptors Network (RLDN)

    Figure 2.  Basic structure of the four-layer convolution block

    Figure 3.  Illustration of local descriptors
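    To make Figure 3 concrete: each spatial position of a convolutional feature map is treated as one local descriptor. The sketch below, a hedged illustration in the spirit of the DN4-style image-to-class measure [5] used by the k-NN classifier in the ablation study, flattens a feature map into descriptors and scores a query against a class by summing top-k cosine similarities; the value of k and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def to_local_descriptors(feat):
    """Flatten a (C, H, W) feature map into H*W local descriptors of dim C."""
    c, h, w = feat.shape
    return feat.reshape(c, h * w).t()              # (H*W, C)

def image_to_class_score(query_feat, class_feats, k=3):
    """DN4-style score: for each query descriptor, sum the cosine
    similarities of its k nearest descriptors from the class pool."""
    q = F.normalize(to_local_descriptors(query_feat), dim=1)
    pool = torch.cat([to_local_descriptors(f) for f in class_feats])
    s = F.normalize(pool, dim=1)
    sim = q @ s.t()                                # pairwise cosine similarity
    return sim.topk(k, dim=1).values.sum()

# Toy usage: 64-channel 21x21 maps, one query against a 5-shot class.
query = torch.randn(64, 21, 21)
support = [torch.randn(64, 21, 21) for _ in range(5)]
print(image_to_class_score(query, support).item())
```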

    Figure 4.  Adjacency matrix constructed by the adjacent-connection method
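    For Figure 4, a minimal sketch of building an adjacency matrix over the grid of local descriptors and applying one graph-convolution propagation step in the style of Kipf and Welling [8]. The 4-neighbour connectivity, self-loops, and symmetric normalization are assumptions about the adjacent-connection method, not the paper's exact construction.

```python
import torch

def adjacent_adjacency(h, w):
    """Adjacency over an h x w grid of descriptors: link each position to
    its spatial neighbours (4-connectivity, an assumption) plus itself."""
    n = h * w
    a = torch.eye(n)
    for i in range(h):
        for j in range(w):
            u = i * w + j
            if j + 1 < w:                  # right neighbour
                a[u, u + 1] = a[u + 1, u] = 1.0
            if i + 1 < h:                  # bottom neighbour
                a[u, u + w] = a[u + w, u] = 1.0
    return a

def gcn_step(x, a, weight):
    """One propagation step D^-1/2 A D^-1/2 X W, as in Kipf & Welling."""
    d_inv_sqrt = a.sum(dim=1).rsqrt().diag()
    return torch.relu(d_inv_sqrt @ a @ d_inv_sqrt @ x @ weight)

# Toy usage: a 21x21 grid of 64-d local descriptors.
x = torch.randn(21 * 21, 64)
a = adjacent_adjacency(21, 21)
w = torch.randn(64, 64) * 0.1
print(gcn_step(x, a, w).shape)             # torch.Size([441, 64])
```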

    Figure 5.  Global feature extraction module
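    For Figure 5, a plausible minimal sketch of producing a class-level global descriptor and concatenating it onto every local descriptor, as the abstract describes. Spatial average pooling and averaging across support shots are illustrative assumptions about how the module fuses global features.

```python
import torch

def restore_with_global(local_desc, support_feats):
    """Concatenate a class-level global descriptor onto each local descriptor.
    local_desc: (N, C) descriptors; support_feats: list of (C, H, W) maps.
    Spatial average pooling + mean over shots is an assumption."""
    pooled = torch.stack([f.mean(dim=(1, 2)) for f in support_feats])  # (S, C)
    global_desc = pooled.mean(dim=0)                                   # (C,)
    g = global_desc.expand(local_desc.size(0), -1)                     # (N, C)
    return torch.cat([local_desc, g], dim=1)                           # (N, 2C)

# Toy usage: 441 local descriptors of dim 64, a 5-shot support set.
local = torch.randn(441, 64)
support = [torch.randn(64, 21, 21) for _ in range(5)]
print(restore_with_global(local, support).shape)   # torch.Size([441, 128])
```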

    Figure 6.  Feature map visualization comparison: (left) original image; (middle) local descriptors before restoration; (right) feature descriptors after restoration

    Table 1.  Accuracy (%) of different models on the MiniImageNet dataset

    Model                     Year  Type               5-way 1-shot  5-way 5-shot
    Matching Nets[12]         2016  metric learning    43.56±0.84    55.31±0.73
    Meta-learner LSTM[20]     2016  meta-learning      43.44±0.77    60.60±0.71
    MAML[21]                  2017  meta-learning      48.70±1.84    63.11±0.92
    Prototypical Nets[22]     2017  metric learning    49.42±0.78    68.20±0.66
    GNN[23]                   2017  metric learning    50.33±0.36    66.41±0.63
    Relation Networks[24]     2018  metric learning    50.44±0.82    65.32±0.70
    CovaMNet[6]               2019  local descriptors  51.19±0.76    67.65±0.63
    SAML[7]                   2019  local descriptors  52.22±0.71    66.49±0.73
    DN4[5]                    2019  local descriptors  51.24±0.74    71.02±0.64
    ADM[9]                    2020  local descriptors  54.26±0.63    72.54±0.50
    FEAT[25]                  2020  metric learning    55.15±0.70    71.61±0.63
    MELR[26]                  2021  meta-learning      55.35±0.43    72.27±0.35
    IEPT[27]                  2021  meta-learning      56.26±0.45    73.91±0.34
    RLDN-size84 (ours)        —     local descriptors  54.18±0.85    72.91±0.60
    RLDN-size224 (ours)       —     local descriptors  57.35±0.83    74.41±0.67

    Table 2.  Accuracy (%) of different models on multiple fine-grained datasets

    Model                    Type               Stanford Dogs             Stanford Cars             CUB-200-2010
                                                1-shot      5-shot        1-shot      5-shot        1-shot      5-shot
    Matching Net[12]         metric learning    35.80±0.99  47.50±1.03    34.80±0.98  44.70±1.03    45.30±1.03  59.50±1.01
    Prototypical Nets[22]    metric learning    37.59±1.00  48.19±1.03    40.90±1.01  52.93±1.03    37.36±1.00  45.28±1.03
    GNN[23]                  metric learning    46.98±0.98  62.27±0.95    55.85±0.97  71.25±0.89    51.83±0.98  63.69±0.94
    DN4[5]                   local descriptors  45.41±0.76  63.51±0.62    59.84±0.80  88.65±0.44    46.84±0.81  74.92±0.64
    D2N2[10]                 local descriptors  47.74±0.83  70.76±0.74    59.46±0.81  86.76±0.53    56.85±0.93  77.78±0.51
    RLDN-size84 (ours)       local descriptors  52.51±0.91  69.35±0.68    67.27±0.86  90.10±0.43    58.15±0.83  77.15±0.71

    Table 3.  Accuracy (%) of different models on the CUB-200-2011 dataset

    Model   5-way 1-shot  5-way 5-shot
    DN4     61.85±0.67    80.50±0.86
    RLDN    64.43±0.75    83.12±0.85

    Table 4.  Results of the comparison experiment with different input image sizes (accuracy, %)

    Input image size  Feature map size  5-way 1-shot  5-way 5-shot
    84×84             21×21             54.18±0.85    72.91±0.61
    224×224           14×14             57.35±0.83    74.41±0.67

    Table 5.  Results of the module ablation experiment with input image sizes of 84×84 and 224×224 (accuracy, %)

    Loss function       Module                                           84×84                       224×224
                                                                         5-way 1-shot  5-way 5-shot  5-way 1-shot  5-way 5-shot
    Cross-entropy loss  k-NN classification                              51.24±0.74    71.02±0.64    53.33±0.83    71.40±0.66
                        adjacent GCN + k-NN classification               52.36±0.76    72.08±0.57    54.82±0.74    73.06±0.67
                        global feature extraction + k-NN classification  52.04±0.83    71.84±0.64    54.58±0.67    72.74±0.76
                        RLDN                                             53.72±0.75    72.31±0.76    56.90±0.78    73.90±0.75
    Hybrid loss         k-NN classification                              51.72±0.72    71.61±0.67    53.94±0.73    72.03±0.56
                        adjacent GCN + k-NN classification               52.87±0.82    72.51±0.65    55.34±0.85    73.51±0.63
                        global feature extraction + k-NN classification  52.53±0.75    72.30±0.56    55.04±0.76    73.28±0.57
                        RLDN                                             54.18±0.85    72.91±0.61    57.35±0.83    74.41±0.67
  • [1] KIM I, BAEK W, KIM S. Spatially attentive output layer for image classification[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE, 2020: 9530-9539. DOI: 10.1109/CVPR42600.2020.00955.
    [2] ZHOU B, LIU C, ZHOU X K. A video SAR moving target detection method based on shadow detection[J]. Microelectronics & Computer, 2021, 38(12): 24-30. DOI: 10.19304/J.ISSN1000-7180.2021.0329.
    [3] XIONG W, TONG L, LI L R, et al. Semantic segmentation algorithm based on separable dilated convolution and joint normalization method[J]. Microelectronics & Computer, 2020, 37(10): 18-23. DOI: 10.19304/j.cnki.issn1000-7180.2020.10.004.
    [4] XIE S N, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 5987-5995. DOI: 10.1109/CVPR.2017.634.
    [5] LI W B, WANG L, XU J L, et al. Revisiting local descriptor based image-to-class measure for few-shot learning[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 7253-7260. DOI: 10.1109/CVPR.2019.00743.
    [6] LI W B, XU J L, HUO J, et al. Distribution consistency based covariance metric networks for few-shot learning[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 8642-8649. DOI: 10.1609/aaai.v33i01.33018642.
    [7] HAO F S, HE F X, CHENG J, et al. Collect and select: Semantic alignment metric learning for few-shot learning[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE, 2019: 8459-8468. DOI: 10.1109/ICCV.2019.00855.
    [8] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[J]. arXiv preprint arXiv:1609.02907, 2016. DOI: 10.48550/arXiv.1609.02907.
    [9] LI W, WANG L, HUO J, et al. Asymmetric distribution measure for few-shot learning[J]. arXiv preprint arXiv:2002.00153, 2020. DOI: 10.24963/ijcai.2020/405.
    [10] YANG X, NAN X T, SONG B. D2N4: A discriminative deep nearest neighbor neural network for few-shot space target recognition[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(5): 3667-3676. DOI: 10.1109/TGRS.2019.2959838.
    [11] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 815-823. DOI: 10.1109/CVPR.2015.7298682.
    [12] VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc., 2016: 3637-3645.
    [13] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7132-7141. DOI: 10.1109/CVPR.2018.00745.
    [14] BOIMAN O, SHECHTMAN E, IRANI M. In defense of nearest-neighbor based image classification[C]//2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, AK, USA: IEEE, 2008: 1-8. DOI: 10.1109/CVPR.2008.4587598.
    [15] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA: IEEE, 2009: 248-255. DOI: 10.1109/CVPR.2009.5206848.
    [16] DONAHUE J, JIA Y, VINYALS O, et al. DeCAF: a deep convolutional activation feature for generic visual recognition[C]//International Conference on Machine Learning. PMLR, 2014: 647-655.
    [17] LIN T Y, ROYCHOWDHURY A, MAJI S. Bilinear CNN models for fine-grained visual recognition[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1449-1457. DOI: 10.48550/arXiv.1504.07889.
    [18] KHOSLA A, JAYADEVAPRAKASH N, YAO B, et al. Novel dataset for fine-grained image categorization: Stanford dogs[C]//Proc. CVPR Workshop on Fine-Grained Visual Categorization (FGVC). Citeseer, 2011, 2(1).
    [19] KRAUSE J, STARK M, DENG J, et al. 3D object representations for fine-grained categorization[C]//Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops. Sydney, NSW, Australia: IEEE: 2013: 554-561. DOI: 10.1109/ICCVW.2013.77
    [20] RAVI S, LAROCHELLE H. Optimization as a model for few-shot learning[C]//International Conference on Learning Representations. Toulon, France: ICLR, 2017.
    [21] FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney, NSW, Australia: JMLR. org, 2017: 1126-1135.
    [22] SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[C]//Advances in Neural Information Processing Systems. California, USA: NIPS, 2017: 4080-4090. DOI: 10.48550/arXiv.1703.05175
    [23] GARCIA V, BRUNA J. Few-shot learning with graph neural networks[J]. arXiv preprint arXiv:1711.04043, 2018.
    [24] SUNG F, YANG Y X, ZHANG L, et al. Learning to compare: Relation network for few-shot learning[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 1199-1208. DOI: 10.1109/CVPR.2018.00131.
    [25] YE H J, HU H, ZHAN D C, et al. Few-shot learning via embedding adaptation with set-to-set functions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 8808-8817. DOI: 10.1109/CVPR42600.2020.00883
    [26] FEI N, LU Z, XIANG T, et al. MELR: meta-learning via modeling episode-level relationships for few-shot learning[C]//International Conference on Learning Representations. 2021.
    [27] ZHANG M, ZHANG J, LU Z, et al. IEPT: instance-level and episode-level pretext tasks for few-shot learning[C]//International Conference on Learning Representations. 2021.
Publication history
  • Received: 2022-02-18
  • Revised: 2022-03-13
  • Published online: 2022-08-15
