Acta Agriculturae Zhejiangensis ›› 2025, Vol. 37 ›› Issue (7): 1521-1532.DOI: 10.3969/j.issn.1004-1524.20240733

• Environmental Science •

Inversion of soil total iron content using a random forest model based on multi-spectral transformation and principal component analysis

JIANG Zhenlan1, CHEN Fuxun1, LUO Shuangfei1, LUO Yeqin1, SHA Jinming2,*

  1. College of Geography and Oceanography, Minjiang University, Fuzhou 350108, China
  2. School of Geographical Science, Fujian Normal University, Fuzhou 350108, China
  • Received: 2024-08-12 Online: 2025-07-25 Published: 2025-08-20

Abstract:

Hyperspectral inversion models for soil total iron content typically use a single spectral variable as input, which neglects the complementarity among spectral variables. In addition, redundancy among spectral bands reduces the prediction accuracy and generalization ability of the models. In this study, a random forest (RF) model optimized by integrating multiple spectral variables with principal component analysis (PCA) was proposed, with soil total iron content in Fuzhou City, China, as the study object. A combined spectral variable set was constructed from the original reflectance and its 13 mathematical transformations. For variable optimization, PCA was employed in conjunction with several variable selection methods: multiple linear regression (MLR), competitive adaptive reweighted sampling (CARS), genetic algorithm (GA), successive projections algorithm (SPA), and uninformative variable elimination (UVE). RF inversion models were then established on the optimized variable sets to predict soil total iron content. All the constructed models showed excellent predictive capability on the validation set, with coefficient of determination (R²) values higher than 0.8 and residual prediction deviation (RPD) values exceeding 2.8. Among these models, the CARS-PCA-RF, GA-PCA-RF and MLR-PCA-RF models demonstrated strong predictive ability, with validation-set RPD values exceeding 3. The CARS-PCA-RF model performed best, with an RPD of 3.292 on the validation set, highlighting the potential of combining CARS with PCA for variable selection in the hyperspectral prediction of soil total iron content. The proposed method, based on multiple spectral transformations and PCA-optimized input variables, improved the accuracy and stability of soil total iron content prediction and provides a new solution for the hyperspectral prediction of regional soil total iron content.
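
The two validation metrics reported in the abstract, and the idea of building a variable set from transformed reflectance, can be illustrated with a minimal Python sketch. The abstract gives no formulas and does not list the 13 transformations, so everything here is an assumption: the four transformations shown (reciprocal, base-10 logarithm, square root, first-order difference) are common choices in soil spectroscopy rather than the authors' confirmed set, RPD is taken as the sample standard deviation of the observed values divided by the RMSE, and all function names and sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def transform_spectrum(reflectance):
    """Return assumed transformations of one reflectance curve (values must be > 0)."""
    r = list(reflectance)
    return {
        "R": r,                                          # original reflectance
        "1/R": [1.0 / v for v in r],                     # reciprocal
        "log10(R)": [math.log10(v) for v in r],          # base-10 logarithm
        "sqrt(R)": [math.sqrt(v) for v in r],            # square root
        "R'": [b - a for a, b in zip(r, r[1:])],         # first-order difference between adjacent bands
    }


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - y_bar) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    return math.sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def rpd(observed, predicted):
    """Assumed RPD definition: sample standard deviation of observations / RMSE."""
    return stdev(observed) / rmse(observed, predicted)


if __name__ == "__main__":
    # Hypothetical values, not data from the study.
    observed = [2.0, 3.0, 4.0, 5.0, 6.0]
    predicted = [2.1, 2.9, 4.2, 4.8, 6.1]
    print("R2 :", round(r_squared(observed, predicted), 3))
    print("RPD:", round(rpd(observed, predicted), 3))
    print(sorted(transform_spectrum([0.25, 0.5, 1.0])))
```

Under this definition, a larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why it is reported alongside R² when comparing the models.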

Key words: soil total iron content, spectral transformation, random forest, principal component analysis, hyperspectral prediction
