Acta Agriculturae Zhejiangensis, 2023, Vol. 35, Issue (8): 1915-1926. DOI: 10.3969/j.issn.1004-1524.20221056

• Biosystems Engineering •

Near-infrared quantitative detection of fat in peeled Korean pine seeds based on manifold learning

QIU Xunchao1,2, CAO Jun2,*, ZHANG Yizhuo2

  1. Department of Computer Engineering, Harbin Finance University, Harbin 150030, Heilongjiang, China
  2. College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin 150040, Heilongjiang, China
  • Received: 2022-07-17 Online: 2023-08-25 Published: 2023-08-29
  • About the author: QIU Xunchao (born 1986), female, from Harbin, Heilongjiang; doctoral candidate and associate professor; research interests: nondestructive testing of agricultural and forestry products, and agricultural and forestry mechanization engineering. E-mail: ldqiuxunchao@126.com
  • Corresponding author: *CAO Jun, E-mail: ldcaojun1956@163.com
  • Funding:
    National Natural Science Foundation of China (31270757); Fundamental Research Funds for Heilongjiang Provincial Undergraduate Universities (Young Academic Backbone Research Project) (2021-KYYWF-019); Central Universities Innovation Team and Major Project Cultivation Fund (E2572016EBC3)

Abstract:

The fat content of peeled Korean pine seeds is an important indicator of their edible and breeding value, and near-infrared spectroscopy was used here to measure it nondestructively. The raw spectra were first pretreated with standard normal variate correction, a first derivative, and a symlet4 wavelet transform (SNV+1st-Der+Sym4). Because the conventional dimensionality reduction method, principal component analysis (PCA), is insensitive to nonlinear complex structure and completely removes the linear correlation information between features, five manifold learning methods were applied separately for nonlinear dimensionality reduction: isometric mapping (Isomap), locally linear embedding (LLE), modified locally linear embedding (MLLE), local tangent space alignment (LTSA), and Hessian-based locally linear embedding (HLLE). With the model built by partial least squares (PLS) taken as the calibration model, ridge regression (Ridge), support vector regression (SVR), and extreme gradient boosting (XGBoost) models were then established. The parameter-optimized MLLE+SVR model had the best quality. Its optimized dimensionality reduction parameters were a neighborhood number (neighbors) of 30 and a dimension (components) of 16; the mean value of the validation-set mean squared error (mean-MSEV) was 0.646 4, and the mean relative error (MRE) between measured and predicted values on the test set was 0.999 2%. The MLLE+SVR model can therefore be applied well to the quantitative detection of fat in peeled Korean pine seeds.

Key words: peeled Korean pine seeds, fat, manifold learning, near-infrared spectroscopy
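The abstract names a pretreatment chain (SNV, then a first derivative) and two error measures (mean squared error and mean relative error) without giving formulas. The following is a minimal Python sketch of those pieces under standard textbook definitions, which are an assumption and not necessarily the authors' exact implementation. The first derivative is approximated here by a simple adjacent-point difference, the Sym4 wavelet step is omitted, and all function names and numeric values are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it by its own std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 in the denominator), assumed here.
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def first_derivative(spectrum):
    """Adjacent-point difference, assuming evenly spaced wavelengths."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def mean_squared_error(measured, predicted):
    """Average squared difference between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def mean_relative_error(measured, predicted):
    """Average of |measured - predicted| / |measured|, expressed in percent."""
    total = sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Made-up absorbance values for one spectrum (illustration only).
spectrum = [0.40, 0.42, 0.47, 0.45, 0.41]
pretreated = first_derivative(snv(spectrum))

# Made-up measured and predicted fat contents (illustration only).
measured = [60.0, 62.0, 64.0]
predicted = [60.6, 61.38, 64.64]

print("pretreated spectrum:", pretreated)
print("MSE:", mean_squared_error(measured, predicted))
print("MRE (%):", mean_relative_error(measured, predicted))
```

For the dimensionality reduction and regression steps themselves, one possible route (again an assumption, not confirmed by the source) is scikit-learn's `LocallyLinearEmbedding` with `method="modified"`, `n_neighbors=30`, and `n_components=16`, followed by `SVR`.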

CLC number: