Acta Agriculturae Zhejiangensis ›› 2024, Vol. 36 ›› Issue (4): 952-967. DOI: 10.3969/j.issn.1004-1524.20230621

• Biosystems Engineering •

YOLOv5s high-density koi fry detection method based on fusion non-local operation

TANG Yonghua1, SHI Feifan1,*, LIN Sen2, ZHANG Zhipeng1, MENG Yanjun1, LIU Xingtong1

  1. School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870, Liaoning, China
  2. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, Liaoning, China
  • Received: 2023-05-12 Online: 2024-04-25 Published: 2024-04-29
  • About the author: TANG Yonghua (born 1980), male, from Shouguang, Shandong; PhD, lecturer; mainly engaged in research on deep learning, image processing, and digital systems. E-mail: tangyonghua@sut.edu.cn
  • Corresponding author: *SHI Feifan, E-mail: 2328930037@qq.com
  • Funding: Liaoning Province Robotics Joint Fund (20180520022); 2023 Liaoning Province Applied Basic Research Program (2023JH2/101300237)

Abstract:

Existing methods are poorly suited to detecting high-density koi fry. To address this, this paper proposes NBC-YOLOv5s (MS-Non-local BIFPN coordinate attention YOLOv5s), a target detection algorithm based on non-local operations. First, a multi-scale non-local operator (MS-Non-local) is added to the YOLOv5s backbone network to strengthen the model's feature extraction for high-density koi fry. Second, a bi-directional weighted feature pyramid network (BIFPN) is used in the neck network to improve feature fusion efficiency. Finally, a coordinate attention (CA) mechanism is introduced at the feature fusion points of the network to increase the model's attention to key image information. To verify the effectiveness of the algorithm, a koi fry dataset was built from a real fish farm environment. The experimental results show that the precision, recall, and mean average precision (mAP) of NBC-YOLOv5s are 88.5%, 89.7%, and 93.7%, respectively, which are 0.6, 9.0, and 4.4 percentage points higher than those of the original YOLOv5s. To verify the performance gain that MS-Non-local brings to YOLOv5s, it is compared with three mechanisms: convolutional block attention module (CBAM), squeeze and excitation (SE), and bi-level routing attention (BRA). The mAP obtained with MS-Non-local is 2.6, 2.1, and 0.9 percentage points higher than with CBAM, SE, and BRA, respectively. In addition, a component-by-component breakdown of the model is used to analyze how effectively the method detects koi fry in images of different densities. The results show that the algorithm can detect high-density koi fry in real scenarios and can provide effective technical support for screening high-quality koi.
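For readers unfamiliar with the non-local operation that the backbone modification builds on, the following is a minimal sketch of the generic idea: each position's output is a similarity-weighted sum over all positions, added back to the input. It is not the paper's MS-Non-local operator. The abstract does not describe the multi-scale design, and practical non-local blocks use learned embeddings, which are replaced here by identity mappings as a simplifying assumption. The function name `non_local` is hypothetical.

```python
import math


def non_local(features):
    """Generic non-local operation over N positions (illustrative only).

    features: list of N feature vectors, each a list of C floats.
    Returns a list of the same shape: input plus an attention-weighted
    sum over ALL positions (residual connection).
    Assumption: identity embeddings instead of learned projections.
    """
    out = []
    for xi in features:
        # Dot-product similarity between position i and every position j.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        # Softmax over j, subtracting the max for numerical stability.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Aggregate every position's features using the attention weights.
        agg = [
            sum(w * xj[c] for w, xj in zip(weights, features))
            for c in range(len(xi))
        ]
        # Residual connection: add the aggregated context to the input.
        out.append([a + b for a, b in zip(xi, agg)])
    return out


# Two identical positions attend to each other equally.
print(non_local([[1.0, 0.0], [1.0, 0.0]]))  # [[2.0, 0.0], [2.0, 0.0]]
```

Because every position attends to every other position, the operation can relate fry that are far apart in the image, which is the general motivation for using it on dense scenes. Its cost grows quadratically with the number of positions.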

Key words: koi, fry detection, high-density target, YOLOv5s, multi-scale non-local operator
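The gains in the abstract are stated in percentage points, not in relative percent. The short sketch below works through that arithmetic using only the figures quoted in the abstract. The baseline YOLOv5s values it prints are implied by subtraction and are not quoted from the paper, and the helper names are hypothetical.

```python
# Figures quoted in the abstract for NBC-YOLOv5s (in percent).
nbc = {"precision": 88.5, "recall": 89.7, "mAP": 93.7}
# Reported gains over the original YOLOv5s (in percentage points).
gain_pp = {"precision": 0.6, "recall": 9.0, "mAP": 4.4}


def implied_baseline(improved, gain):
    """Baseline implied by subtracting the percentage-point gains."""
    return {k: round(improved[k] - gain[k], 1) for k in improved}


def relative_gain(improved, base):
    """Relative improvement in percent of the baseline value."""
    return {k: 100.0 * (improved[k] - base[k]) / base[k] for k in improved}


baseline = implied_baseline(nbc, gain_pp)
relative = relative_gain(nbc, baseline)

print(baseline)  # {'precision': 87.9, 'recall': 80.7, 'mAP': 89.3}
for name, value in relative.items():
    print(f"{name}: +{gain_pp[name]} points, {value:.1f}% relative")
```

The distinction matters most for recall: a 9.0-point gain on an implied baseline of 80.7% is a relative improvement of about 11%.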
