

Research on the identification of desert plants in Xinjiang based on CNN and Swin Transformer

XU Chuntao, QIAN Yurong, FAN Yingying, DU Zhenyu, SHAO Youpeng

Citation: XU C T, QIAN Y R, FAN Y Y, et al. Research on the identification of desert plants in Xinjiang based on CNN and Swin Transformer[J]. Microelectronics & Computer, 2023, 40(6): 33-41. DOI: 10.19304/J.ISSN1000-7180.2022.0577


doi: 10.19304/J.ISSN1000-7180.2022.0577
Funds: Supported by the National Natural Science Foundation of China (61966035), the International Cooperation Project of the Science and Technology Department of the Xinjiang Uygur Autonomous Region (2020E01023), the Joint Funds of the National Natural Science Foundation of China (Key Program, U1803261), and the Research Foundation of Xinjiang University of Finance and Economics (2017XYB015)
    About the authors:

    XU Chuntao: male, born in 1996, M.S. candidate. His research interests include computer vision and image processing.

    FAN Yingying: female, born in 1993, Ph.D. Her research interest is remote sensing image processing.

    DU Zhenyu: male, born in 1999, M.S. candidate. His research interests include computer vision and image processing.

    SHAO Youpeng: male, born in 1997, M.S. candidate. His research interest is network security.

    Corresponding author:

    QIAN Yurong: female, born in 1980, Ph.D., professor. Her research interests include network computing and image processing. E-mail: qyr@xju.edu.cn

  • CLC number: TP183

Research on the identification of desert plants in Xinjiang based on CNN and Swin Transformer

  • Abstract:

    Under the combined influence of climate and environment, the desert regions of Xinjiang are prone to drought disasters that harm agricultural and pastoral production, which is detrimental to the sustainable development of Xinjiang's economy. The identification of Xinjiang desert plants is the basis on which plant researchers assess plant growth conditions, and a prerequisite for ecological protection research and the implementation of remediation measures. At the same time, images of Xinjiang desert plants exhibit inter-class similarity, complex backgrounds, and imbalanced sample distributions, which make the task challenging. To improve recognition accuracy, precisely locate important local features, and take complex global information into account, this paper proposes a plant image recognition method that fuses a convolutional neural network (CNN) with a Swin Transformer. The method combines the strength of CNNs at extracting local features with the strength of the Swin Transformer at capturing global representations; an improved Convolutional Block Attention Module (CBAM) is embedded in the CNN branch to fully extract discriminative local key features, and the Focal Loss function is used to address the sample imbalance. Experimental results show that, compared with single-branch networks, the proposed fusion method extracts image features more fully on the Xinjiang desert plant dataset, reaching a recognition accuracy of 97.99%, with precision, recall, and F1-score all surpassing existing methods. Finally, visualization analysis and the confusion matrix further corroborate the effectiveness of the method.
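    The abstract describes a two-branch design: a CNN branch for local features and a Swin Transformer branch for global representations, fused before classification. The following is a minimal PyTorch sketch of that idea, assuming ResNet-34 and Swin-T backbones, concatenation fusion, and a made-up class count; it illustrates the structure, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class DualBranchNet(nn.Module):
    """Illustrative CNN + Swin Transformer fusion classifier."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.cnn = models.resnet34(weights=None)
        self.cnn.fc = nn.Identity()         # CNN branch -> 512-d local features
        self.swin = models.swin_t(weights=None)
        self.swin.head = nn.Identity()      # Swin branch -> 768-d global features
        self.classifier = nn.Linear(512 + 768, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local_feat = self.cnn(x)                             # local cues
        global_feat = self.swin(x)                           # global context
        fused = torch.cat([local_feat, global_feat], dim=1)  # concat fusion
        return self.classifier(fused)

model = DualBranchNet(num_classes=20)        # hypothetical class count
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)                          # torch.Size([2, 20])
```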

     

  • Figure 1. The overall structure of the model

    Figure 2. The structure of the residual unit

    Figure 3. Hierarchical feature maps of the Swin Transformer

    Figure 4. Swin Transformer block

    Figure 5. Dynamic ReLU activation function

    Figure 6. Dy-CBAM module

    Figure 7. Part of the plant dataset

    Figure 8. Cutout data augmentation example

    Figure 9. Visualization results of sample images

    Figure 10. Confusion matrix
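
    Figures 5 and 6 refer to the Dynamic ReLU activation and the Dy-CBAM attention module embedded in the CNN branch. For orientation, here is a minimal sketch of a standard CBAM block; the paper's Dy-CBAM variant replaces the plain ReLU with Dynamic ReLU, which is only marked by a comment below, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention (standard CBAM)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(            # shared MLP for channel attention
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),           # Dy-CBAM would use Dynamic ReLU here
            nn.Linear(channels // reduction, channels),
        )
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # spatial attention

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention from average- and max-pooled descriptors
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        x = x * w.view(b, c, 1, 1)
        # Spatial attention from channel-wise average and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))

print(CBAM(64)(torch.randn(2, 64, 56, 56)).shape)  # torch.Size([2, 64, 56, 56])
```

    Figure 8 shows Cutout augmentation, which masks out part of each training image. A minimal sketch, assuming one square patch per image and an arbitrary patch size:

```python
import torch

def cutout(img: torch.Tensor, size: int = 56) -> torch.Tensor:
    """img: (C, H, W). Returns a copy with one size x size patch zeroed."""
    _, h, w = img.shape
    cy = int(torch.randint(h, (1,)))          # random patch center
    cx = int(torch.randint(w, (1,)))
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = img.clone()
    out[:, y1:y2, x1:x2] = 0.0                # zero-mask the patch
    return out

print(cutout(torch.randn(3, 224, 224)).shape)  # torch.Size([3, 224, 224])
```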

    Table 1. Influence of different components on classification results

    Method     Accuracy/%   Precision/%   Recall/%   F1-Score/%
    ①          92.38        91.77         91.48      91.62
    ②          94.46        94.40         93.87      94.12
    ③          96.74        96.82         96.33      96.57
    ④          97.10        96.83         96.85      96.84
    ⑤          97.87        97.75         97.66      97.70
    ⑥          97.75        97.74         97.59      97.66
    ⑦ (ours)   97.99        97.89         97.85      97.87

    Table 2. Influence of the loss function on the results

    Method     FL   Accuracy/%   Precision/%   Recall/%   F1-Score/%
    ①               97.75        97.62         97.57      97.59
    ② (ours)   ✓    97.99        97.89         97.85      97.87
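
    The "FL" column in Table 2 marks the use of Focal Loss, which the paper adopts to counter sample imbalance. A minimal multi-class focal loss sketch follows; the gamma and alpha values are the common defaults from the original focal loss paper, not necessarily the settings used here.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """Focal loss: down-weights the contribution of well-classified samples."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # ce = -log(p_t)
    p_t = torch.exp(-ce)                                     # prob. of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()

# Drop-in replacement for nn.CrossEntropyLoss() in a training loop:
loss = focal_loss(torch.randn(4, 20), torch.randint(0, 20, (4,)))
```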

    Table 3. Influence of different fusion methods on the results

    Method     Fuse     Accuracy/%   Precision/%   Recall/%   F1-Score/%
    ①          Add      97.68        97.62         97.53      97.57
    ② (ours)   Concat   97.99        97.89         97.85      97.87
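
    Table 3 compares element-wise addition with channel concatenation as the fusion step between the two branches. A tiny illustration with made-up feature dimensions:

```python
import torch

local_feat = torch.randn(2, 512)   # CNN-branch features (dimension assumed)
global_feat = torch.randn(2, 512)  # Swin-branch features (dimension assumed)

fused_add = local_feat + global_feat                     # "Add": requires matching dims
fused_cat = torch.cat([local_feat, global_feat], dim=1)  # "Concat": keeps both, 1024-d
```

    Concatenation preserves both feature sets at the cost of a wider classifier input, which matches the small accuracy gain reported in Table 3.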

    Table 4. Influence of different positions of Dy-CBAM

    Method     Accuracy/%   Precision/%   Recall/%   F1-Score/%
    ①          97.87        97.75         97.66      97.70
    ②          97.82        97.57         97.74      97.65
    ③ (ours)   97.99        97.89         97.85      97.87
    ④          97.91        97.81         97.74      97.77

    Table 5. Comparison of different algorithms on this dataset

    Model          Accuracy/%   Precision/%   Recall/%   F1-Score/%
    VGG19          89.87        88.94         89.08      89.01
    ResNet50       94.75        94.52         94.42      94.47
    ResNeXt50      95.30        95.02         94.89      94.95
    DenseNet121    94.11        93.45         93.41      93.43
    DenseNet169    94.79        94.43         94.50      94.46
    EfficientNet   94.85        94.62         94.29      94.45
    ViT            96.41        95.97         96.08      96.02
    BCNN           93.09        92.84         92.50      92.67
    Ours           97.99        97.89         97.85      97.87
Publication history
  • Received: 2022-09-18
  • Revised: 2022-10-12
