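Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance
Cai Lianghong, Ding Jianli※
(1. College of Resources & Environmental Science, Xinjiang University, Urumqi 830046, China; 2. Key Laboratory of Oasis Ecology, Ministry of Education, Xinjiang University, Urumqi 830046, China)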
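To enable rapid monitoring of soil moisture content (SMC) in arid regions, this study took the Weigan-Kuqa River oasis as the target area. The wavelet transform (WT) was used to decompose the reflectance spectra into 1~8 levels, and correlation analysis was used to determine the maximum decomposition level. Competitive adaptive reweighted sampling (CARS) was then applied to filter out redundant variables and select the wavelength variables most strongly correlated with SMC. The selected wavelength variables of the characteristic spectra at each level were merged into an optimal variable set, and partial least squares regression (PLSR) was used to build and analyze SMC prediction models. The results showed that: 1) during wavelet decomposition, the correlation between soil reflectance and SMC strengthened steadily and peaked at the sixth decomposition level (L6), so the maximum decomposition level was set to 6; 2) the optimal variable set selected by the coupled WT-CARS algorithm contained 131 wavelength variables in the 400~500, 1 320~1 461, 1 851~1 961 and 2 125~2 268 nm regions; 3) compared with the full-band model, the models built on the CARS-selected variables of each level were all more accurate, and the model based on the optimal variable set was the most accurate, with a calibration root mean square error (RMSEC) of 0.021, a calibration coefficient of determination ($R_c^2$) of 0.721, a prediction root mean square error (RMSEP) of 0.028, a prediction coefficient of determination ($R_p^2$) of 0.924 and a residual predictive deviation (RPD) of 2.607. The coupled WT-CARS algorithm thus loses as little spectral detail as possible while removing noise fairly thoroughly and eliminating uninformative variables effectively, offering a new approach to SMC prediction in the study area.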
soil; moisture content; spectrum analysis; wavelet transform; competitive adaptive reweighted sampling algorithm; variable selection
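Soil moisture content (SMC) is the carrier of material and energy cycling in the soil system and strongly influences soil properties, the growth and distribution of vegetation, and regional ecosystems[1-2]. Soil moisture, however, is highly sensitive to environmental conditions, and traditional SMC monitoring methods are time-consuming and labor-intensive, so they can neither deliver real-time field observation nor meet the soil moisture monitoring needs of precision agriculture[3]. An efficient and accurate method for monitoring SMC is therefore required. In recent years, hyperspectral remote sensing has attracted attention in SMC monitoring research because of its large spatial coverage, non-contact measurement and timeliness[4]. However, raw soil spectra acquired with hyperspectral techniques show marked spectral noise and severe scattering, and they inevitably contain noise unrelated to SMC[5], which makes SMC spectral information harder to detect. Losing as little spectral detail as possible while removing noise fairly thoroughly has therefore become a key step in spectral modeling.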
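Well-established spectral denoising methods include Savitzky-Golay filtering, median filtering and moving averaging, but for white noise, and especially for random and low-frequency signals, these methods struggle to remove noise without also affecting the useful signal[6]. As a newer denoising technique, the wavelet transform has been applied successfully to hyperspectral data processing[7], and as the algorithms have matured it has increasingly been used to estimate soil properties[8]. Liao Qinhong et al.[9] applied wavelet analysis to the hyperspectral curves of 64 soil samples from the Shunyi area of Beijing and achieved an inversion accuracy of 75% for organic matter; Zhang Rui et al.[6] showed that decomposition and reconstruction at level 6 described soil organic matter properties more accurately; Zheng et al.[10] built characteristic spectra of various soil properties on the basis of decomposition and reconstruction at level 8. These studies all demonstrate the strength of the wavelet transform in focusing on spectral detail through frequency-domain analysis, but because hyperspectral datasets are large, much redundant noise remains after spectral reconstruction. Variable selection has therefore become key to model building in the quantitative analysis of hyperspectral data. Commonly used variable selection methods include genetic algorithms (GA), the successive projections algorithm (SPA), Monte Carlo uninformative variable elimination (MCUVE) and competitive adaptive reweighted sampling (CARS)[11]. Yu Lei et al.[12] obtained good results with CARS when predicting SMC; Wen Zhencai et al.[13] and Yu Shuang et al.[14] compared the four methods above and found that models based on CARS performed best. This indicates that the strength of CARS lies in selecting variables from high-dimensional data, overcoming the combinatorial explosion problem and improving the predictive ability of the model[15-16].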
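At present, the wavelet transform is used mostly in studies of soil organic matter[17], while its application to SMC estimation remains to be explored. This study took soil samples from the Weigan-Kuqa oasis as the research object, decomposed the reflectance spectra with the wavelet transform, and combined the result with the CARS algorithm to build an optimal variable subset for SMC. The aim was to lose as little spectral detail as possible, remove noise fairly thoroughly and eliminate uninformative variables effectively, and finally to build a PLSR prediction model based on the optimal variable set, providing scientific support and a reference for soil moisture research and for local precision agriculture.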
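1.1 Soil samples
The study area was the Weigan-Kuqa oasis (41°08′~41°55′N, 81°06′~83°37′E), located in southern Xinjiang in the north-central Tarim Basin. According to the characteristics of the study area, 39 sampling sites were laid out (Fig.1), and their positions were recorded by GPS for later verification. At each site, soil was collected from a depth of 0~20 cm using the five-point mixing method, and two samples were taken (one brought back in an aluminum box, the other in a plastic bag). In the laboratory, the SMC of the sample in the aluminum box was determined by the oven-drying method (drying in a constant-temperature oven at 105 ℃ for 48 h); the other sample was air-dried indoors, ground and passed through a 2 mm sieve before hyperspectral data were acquired.
Fig.1 Field sample distribution
Spectra were collected in a dark room with a FieldSpec3 spectrometer manufactured by ASD (Analytical Spectral Devices), USA. Its wavelength range is 350~2 500 nm, with a sampling interval of 1.4 nm over 350~1 000 nm and 2 nm over 1 000~2 500 nm, resampled to 1 nm. Soil that had passed the 2 mm sieve was filled into a black dish (11 cm in diameter, 1.4 cm deep) and its surface was scraped level. The light source in the dark room was a 50 W halogen lamp placed 50 cm from the sample at a zenith angle of 15°, and the spectrometer probe was 10 cm from the sample. A diffuse-reflectance standard reference panel was used for calibration before each measurement. Ten spectral curves were collected for each soil sample, and their arithmetic mean was taken as the spectrum of that sample.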
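1.2 Wavelet transform
The wavelet transform inherits the strengths of Fourier analysis while overcoming its inability to analyze the local spectral characteristics of a local signal[18]. By scaling and translating a mother wavelet, it decomposes a signal into time-frequency components in different sub-bands, allowing specific frequency features of the original signal to be observed more clearly[19]; it has been called the microscope of time-frequency analysis.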
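The discrete wavelet transform is the inner product of a finite-length sequence and a discrete mother wavelet[20]:

$$W_f(j,k)=\langle f(n),\psi_{j,k}(n)\rangle=\sum_{n} f(n)\,\overline{\psi_{j,k}(n)}$$

where $W_f(j,k)$ is the result of the wavelet analysis, $f(n)$ is the signal sequence, $\psi_{j,k}(n)$ is the mother wavelet and $\overline{\psi_{j,k}(n)}$ is the complex conjugate of $\psi_{j,k}(n)$. Unlike the Fourier transform, it yields the behavior of the different sub-bands of the signal in the spatial domain[21]. According to this theory, each sub-band of the wavelet decomposition can be regarded as the absorption feature of the original spectrum at a certain frequency, while the corresponding high-frequency signal is removed by the wavelet filter.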
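Following the conclusions of Yu Lei et al.[12], the db4 mother wavelet was selected, and the original spectra were decomposed at levels 1~8 to construct the characteristic spectrum of each level, denoted L1~L8.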
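The level-by-level construction of characteristic spectra can be illustrated with a minimal sketch. The study used the db4 mother wavelet in Matlab; the sketch below swaps in the much simpler Haar wavelet so that it runs with the Python standard library alone, and it assumes that each characteristic spectrum is the low-frequency approximation reconstructed at that level. The function name `haar_approximation` and the reflectance values are hypothetical.

```python
def haar_approximation(signal, level):
    """Reconstruct the Haar approximation of `signal` at the given level.

    For the Haar wavelet, discarding the detail coefficients of levels
    1..level and reconstructing gives the mean of each block of 2**level
    consecutive samples, repeated across that block.
    """
    block = 2 ** level
    if len(signal) % block != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    smoothed = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        mean = sum(chunk) / block
        smoothed.extend([mean] * block)
    return smoothed


# Hypothetical reflectance values at eight consecutive wavelengths.
reflectance = [0.20, 0.22, 0.21, 0.25, 0.30, 0.28, 0.27, 0.31]

# One smoothed spectrum per level, named as in the text (L1, L2, L3).
levels = {f"L{j}": haar_approximation(reflectance, j) for j in range(1, 4)}
```

Higher levels average over wider blocks, which mirrors the trade-off described in the text: noise is suppressed further, but narrow absorption features are eventually flattened as well.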
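1.3 Competitive adaptive reweighted sampling
CARS imitates the "survival of the fittest" principle of Darwin's theory of evolution. It uses adaptive reweighted sampling (ARS) and an exponentially decreasing function (EDF) to select the wavelength variables with large absolute regression coefficients in the PLSR model, and uses ten-fold cross-validation to find the variable subset with the lowest root mean square error of cross-validation (RMSECV), which is taken as the optimal variable subset. It selects wavelength variables that are sensitive to soil properties, avoids the combinatorial explosion problem in variable selection, and is well suited to high-dimensional data[22].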
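The role of the exponentially decreasing function can be sketched as follows. The sketch assumes the form commonly given for CARS, in which the ratio of retained variables in run i is r_i = a·exp(−k·i), with a and k chosen so that all variables are kept in the first run and only two remain in the last. The function name `cars_retained_counts` is hypothetical, and the values 2 151 (bands) and 50 (runs) are taken from this study.

```python
import math


def cars_retained_counts(n_variables, n_runs):
    """Number of variables kept in each sampling run (runs 1..n_runs).

    The constants are set so that the retained ratio equals 1 in the
    first run and 2 / n_variables in the last run.
    """
    k = math.log(n_variables / 2) / (n_runs - 1)
    a = (n_variables / 2) ** (1 / (n_runs - 1))
    return [
        round(n_variables * a * math.exp(-k * i))
        for i in range(1, n_runs + 1)
    ]


# 2 151 spectral bands and 50 Monte Carlo sampling runs, as in this study.
counts = cars_retained_counts(2151, 50)
```

The list falls quickly in the early runs and slowly in the later ones, so most uninformative variables are discarded early and the final runs refine a small candidate set.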
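1.4 Building and validating the PLSR model
PLSR combines the advantages of principal component analysis, canonical correlation analysis and ordinary multiple linear regression. It overcomes multicollinearity among the independent variables and the problem of having fewer samples than wavelength variables, which makes the resulting model more stable and aids multivariate statistical analysis[23-24].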
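The following parameters were used to evaluate model accuracy: the determination coefficient of calibration ($R_c^2$), the determination coefficient of prediction ($R_p^2$), the root mean square error of calibration (RMSEC), the root mean square error of prediction (RMSEP) and the residual predictive deviation (RPD). The larger $R^2$ is, the more accurate the model; RMSEC and RMSEP express the precision of the model, and the smaller they are, the more accurate the model. In addition, the model predicts well when RPD≥2, moderately when 1.4≤RPD<2, and has no predictive ability when RPD<1.4[25].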
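A minimal sketch of these accuracy measures is given below. The paper does not state its exact formulas, so the sketch assumes the usual definitions: $R^2$ as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the measured values divided by the RMSE. All function names are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(measured))


def r_squared(measured, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


def rpd(measured, predicted):
    """Residual predictive deviation, assumed here as SD(measured) / RMSE."""
    mean = sum(measured) / len(measured)
    variance = sum((m - mean) ** 2 for m in measured) / (len(measured) - 1)
    return math.sqrt(variance) / rmse(measured, predicted)


def rate_rpd(value):
    """Apply the RPD thresholds quoted in the text."""
    if value >= 2:
        return "good"
    if value >= 1.4:
        return "fair"
    return "no predictive ability"
```

With these thresholds, the RPD of 2.607 reported for the best model in this study falls in the "good" class.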
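2.1 Soil moisture content of the samples
The descriptive statistics of SMC (Table 1) show that the mean SMC of the calibration and validation sets was 14.59% and 14.84%, respectively, while the mean SMC of all soil samples was 14.66%, with a coefficient of variation (CV) of 38.89%, which indicates moderate variability and lies between the values for the calibration and validation sets.
Table 1 Statistical characteristics of soil moisture content (SMC) of soil samples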
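2.2 Wavelet transform and maximum decomposition level
In Matlab R2012a, the original spectra were decomposed into 8 levels with the db4 mother wavelet, and the wavelet coefficients of each level were then reconstructed separately to obtain the characteristic spectrum of each level, denoted L1~L8.
The results are shown in Fig.2. In Fig.2a (L0 is the original spectrum without wavelet transform), the soil shows pronounced water absorption peaks near 1 400 and 1 900 nm and weaker ones near 450 and 2 200 nm. L1 is noisy because noise in the original reflectance is carried over; this is most evident at 350~400 nm, where it appears as small spikes. As decomposition proceeds, high-frequency signals are progressively removed and the carried-over noise weakens, so that little noise remains at L5. Because spectral detail is removed continuously, the curves become progressively smoother and some absorption peaks that characterize soil moisture disappear: for example, pronounced absorption peaks are still present at 1 400 and 1 900 nm in L6 but are barely visible in L7.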
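Running CARS requires that a suitable wavelet decomposition level be determined first. The correlation analysis between the characteristic spectrum of each level and SMC (Table 2) shows that 393 bands of the L1 characteristic spectrum were correlated with SMC at the 0.01 significance level (threshold ±0.408). The number of significant bands increased with the decomposition level and reached its maximum of 602 at L6, while the largest positive correlation, 0.619, occurred at L4. With further decomposition, the number of significant bands in L7 and beyond dropped quickly, as did the maximum correlation. Overall, the L6 characteristic spectrum removes noise while retaining as much of the original spectral information as possible. The maximum decomposition level was therefore set to 6, and the subsequent analysis was based on L1~L6.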
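The band-counting step can be sketched as follows, assuming Pearson correlation and a two-tailed test. The critical t value of 2.715 for 37 degrees of freedom at the 0.01 level is an approximate table value supplied here as an assumption; with 39 samples it reproduces the ±0.408 threshold quoted above. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def critical_r(t_crit, n_samples):
    """Convert a critical t value into the matching critical correlation."""
    return t_crit / math.sqrt(t_crit ** 2 + n_samples - 2)


def count_significant(bands, smc, threshold):
    """Count bands whose |r| with SMC reaches the threshold.

    `bands` holds one sequence per wavelength, each containing the
    reflectance of every sample at that wavelength.
    """
    return sum(1 for band in bands if abs(pearson(band, smc)) >= threshold)


# Assumed table value: t is about 2.715 for 37 degrees of freedom.
threshold = critical_r(2.715, 39)
```

Repeating `count_significant` for the characteristic spectra of L1 to L8 and comparing the counts is one way to locate the level at which the number of significant bands peaks.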
Fig.2 Reconstructed spectra at wavelet levels 1~8
2.3 CARS-selected variable subsets at different decomposition levels
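In the CARS variable selection of this study, the number of Monte Carlo sampling runs was set to 50. The runs were iterated, the RMSECV of each run was compared, and the variables retained in the run with the lowest RMSECV were taken as the selected variable subset. For brevity, only the selection process for the L1 characteristic spectrum is analyzed here. Because of the exponentially decreasing function (EDF), the number of retained variables decreased exponentially as the number of iterations increased (Fig.3a). Fig.3b shows that, overall, RMSECV first fell and then rose as sampling proceeded. Over iterations 1~28, RMSECV decreased gradually, indicating that a large amount of information or noise unrelated to SMC was removed from the L1 characteristic spectrum; after run 28, RMSECV slowly rose again because key variables sensitive to SMC were being removed. In Fig.3c, each line traces the regression coefficient of one wavelength variable as the number of runs increases. As Fig.3 shows, RMSECV was lowest at the 28th sampling run, so the spectral variables retained in that run form the selected variable set, which contains 23 spectral variables. The minimum RMSECV, the corresponding sampling run and the selected variable set for each of L1~L6 are listed in Table 3.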
Table 2 Correlation analysis between SMC and spectra from wavelet analysis in each level
Fig.3 Variable filtering process by competitive adaptive reweighted sampling (CARS)
2.4 Optimal variable set for SMC of the soil samples
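On the basis of the wavelet transform, CARS was used to select variables at each decomposition level of the soil samples, giving the distribution of selected variables at each level (Fig.4). The selected variable sets of the characteristic spectra are distributed roughly around the water absorption peaks (450, 1 400, 1 900 and 2 200 nm). Because some information reflecting soil properties disappears as the decomposition level increases, the characteristic spectrum of each level can represent only part of the soil's properties. The selected variables of all levels were therefore merged, yielding 131 wavelength variables in the 400~500, 1 320~1 461, 1 851~1 961 and 2 125~2 268 nm regions, which were taken as the optimal variable set VWT-CARS. A considerable share of the selected variables lies within 1 800~2 400 nm, where soil spectral features mainly arise from the fundamental vibrations and the combination and overtone absorptions of Al-OH, C-H, O-H and C=O groups[26-27].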
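Merging the per-level selections can be sketched as a set union, assuming that a wavelength chosen at more than one level is counted once. The function name `merge_selected` and the wavelengths in the example are hypothetical and are not the values reported in Table 3.

```python
def merge_selected(per_level):
    """Merge the wavelengths selected at each level into one sorted list.

    `per_level` maps a level name (for example "L1") to the wavelengths,
    in nm, that CARS retained for that level.
    """
    merged = set()
    for wavelengths in per_level.values():
        merged.update(wavelengths)
    return sorted(merged)


# Hypothetical selections for three levels, for illustration only.
selected = {
    "L1": [452, 1410, 1905],
    "L2": [1410, 1912, 2201],
    "L3": [452, 2201, 2210],
}
optimal_set = merge_selected(selected)
```

Using a union rather than an intersection keeps the moisture-sensitive bands that appear at only one level, which matches the motivation given in the text.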
Table 3 Optimal variables of spectral characteristics of the levels of L1~L6
Fig.4 Distribution of optimal variables based on CARS for characteristic spectrum of L1~L6
2.5 Building and validating PLSR models based on the selected variables
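SMC prediction models were built with the variables selected by the coupled WT-CARS algorithm as independent variables and SMC as the dependent variable (denoted L(i)-CARS-PLSR, i = 1, 2, 3, 4, 5, 6). To highlight the benefit of variable selection, a full-band (L0) PLSR model was included for comparison, and an SMC prediction model based on VWT-CARS was also built. The accuracy of the models for each characteristic spectrum was assessed with the parameters in Table 4.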
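A combined analysis of the accuracy of the models in Table 4 shows that the coupled WT-CARS algorithm improved SMC prediction accuracy. The model based on the optimal variable set VWT-CARS was the most accurate (RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924, RPD=2.607), indicating that the optimal variable set selected in this study predicts the SMC of soil samples in the study area better. Overall, the models built with the coupled WT-CARS algorithm were all more accurate than the full-band model and were also more stable. The coupled algorithm reduces the number of modeling variables while improving model accuracy, which shows that the combination is an effective variable selection method. In the best prediction model of this study, 131 of the 2 151 bands were selected for modeling, which greatly shortened modeling time and improved accuracy, and provides a reference for selecting key bands when soil hyperspectral reflectance is used to retrieve other soil properties in this region.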
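Table 4 Results of estimation for SMC
Validating the VWT-CARS-PLSR model on the prediction set gave $R_p^2$=0.924, RMSEP=0.028 and RPD=2.607. Fig.5 is a scatter plot of measured against predicted values for this model; the points are distributed fairly evenly around the 1∶1 line, indicating high accuracy. This shows that the coupled WT-CARS algorithm can select bands that are effective for predicting SMC and reduce the number of modeling variables, which helps improve model accuracy.
Fig.5 Relationship between measured and predicted soil moisture content by VWT-CARS-PLSR model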
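Rapid, non-destructive prediction of SMC is important for assessing the severity of agricultural drought in arid regions. A soil hyperspectral reflectance curve is the combined expression of different soil properties; it contains a large amount of noise unrelated to SMC, and the large number of bands adds to data redundancy. This study combined the wavelet transform with the CARS algorithm to select effective variables and build PLSR models. Before the CARS optimal variable set was built, correlation analysis between SMC and the characteristic spectrum of each level obtained by the wavelet transform was used to set the maximum decomposition level to 6. As the characteristic spectra compress the data, the spectral absorption features at specific frequencies are accentuated, while uncorrelated spectral features and noise in other bands are suppressed. Chen Zhikun et al.[28], applying the wavelet transform to mineral oil fluorescence spectra, concluded that decomposition at level 4 denoised the spectra while retaining more of the original signal; Zheng et al.[10] found that the wavelet transform at level 8 better reflected the characteristic spectra of soil properties; Wang Yancang et al.[29] found that the organic matter prediction model built from the spectral features of wavelet level 4 performed best. The decomposition levels in these studies differ because of differences in soil type, mother wavelet and the choice of spectral feature reconstruction. Nevertheless, the studies all show that models perform best at intermediate decomposition scales: too low a scale denoises poorly, while excessive decomposition keeps stripping away high-frequency signals, so that peaks and troughs reflecting soil properties disappear and the ability to explain the original spectrum declines. The maximum decomposition level of 6 determined here by correlation analysis is an intermediate scale, which is broadly consistent with those conclusions.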
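Hyperspectral sensing provides a large number of contiguous spectral bands, but raw hyperspectral data are usually noisy, strongly affected by scattering and partly redundant[30], and existing research has shown that redundant information weakens the predictive performance and robustness of models[31]. Selecting spectral variables is therefore essential in soil hyperspectral analysis: it reduces the complexity of the prediction model and removes weakly correlated band variables. Commonly used spectral variable selection methods include the successive projections algorithm (SPA), Monte Carlo uninformative variable elimination (MC-UVE), genetic algorithms (GA) and competitive adaptive reweighted sampling (CARS). Zhan Baishao et al.[11] found that these methods can effectively select sensitive bands from raw hyperspectral data and that CARS performed best; Li Jiangbo et al.[32] compared CARS with MC-UVE and GA and found that CARS gave the best selection, reducing collinearity among variables as well as the number of uninformative variables. This study combined the strengths of the wavelet transform and CARS by running CARS variable selection on each decomposition level of the soil samples. The resulting variable set contains the selected variables of the different levels, 131 in total, located in the 400~500, 1 320~1 461, 1 851~1 961 and 2 125~2 268 nm regions, all near the water absorption peaks (450, 1 400, 1 900 and 2 200 nm). These bands are all strongly correlated with SMC and form an optimal variable set suited to SMC as a whole, which is broadly consistent with the results of Yu Lei et al.[12]. Considering the CARS result of only one decomposition level would easily overlook the moisture-sensitive bands of other levels, so that the selected variables would not fully reflect soil properties and the resulting model would be limited. The variable set selected in this study can therefore serve as the optimal variable set for predicting SMC.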
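The coupled WT-CARS algorithm improved SMC prediction accuracy. Compared with the full-band model, the models built on the CARS-selected variables of each characteristic spectrum were all more accurate, and the model based on the optimal variable set was the most accurate (RMSEP=0.028, $R_p^2$=0.924, RPD=2.607), showing that the coupled WT-CARS algorithm is an effective band selection method for the spectral analysis of soil moisture content. The combination removes uninformative variables effectively while minimizing the influence of collinear variables on the model, which provides theoretical support for selecting bands sensitive to other soil properties.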
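This study used the coupled WT-CARS (wavelet transform-competitive adaptive reweighted sampling) algorithm to select the optimal variable set for SMC (soil moisture content) and examined how well the algorithm predicts soil moisture content. The conclusions are as follows:
1) During wavelet decomposition, the correlation between soil reflectance and SMC first increased and then decreased, and the number of bands significant at the 0.01 level was largest at L6 (the characteristic spectrum of wavelet level 6). Overall, the L6 characteristic spectrum removes noise while retaining as much spectral detail as possible, so level 6 was the maximum decomposition level in this study.
2) The optimal variable set obtained by coupling the wavelet transform with the CARS algorithm contains 131 wavelength variables in the 400~500, 1 320~1 461, 1 851~1 961 and 2 125~2 268 nm regions.
3) Compared with the full-band model, the models built on the CARS-selected variables of each characteristic spectrum were all more accurate, and the model based on the optimal variable set was the most accurate, with RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924 and RPD=2.607. The coupled WT-CARS algorithm thus loses as little spectral detail as possible while removing noise fairly thoroughly and eliminating uninformative variables effectively, offering a new approach to SMC prediction in the study area.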
[1] 邹文秀,韩晓增,江恒,等. 东北黑土区降水特征及其对土壤水分的影响[J]. 农业工程学报,2011,27(9):196-202. Zou Wenxiu, Han Xiaozeng, Jiang Heng, et al. Characteristics of precipitation in black soil region and response of soil moisture dynamics in Northeast China[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2011, 27(9): 196-202. (in Chinese with English abstract)
[2] 张定海,李新荣,陈永乐. 腾格里沙漠人工植被区固沙灌木影响深层土壤水分的动态模拟研究[J]. 生态学报,2016,36(11):3273-3279. Zhang Dinghai, Li Xinrong, Chen Yongle. Simulation study on the effects of sand binding shrub on the deep soil water in a recovered area on the southeast fringe of Tengger Desert, North China[J]. Acta Ecologica Sinica, 2016, 36(11): 3273-3279. (in Chinese with English abstract)
[3] 孙越君,郑小坡,秦其明,等. 不同质量含水量的土壤反射率光谱模拟模型[J]. 光谱学与光谱分析,2015(8):2236-2240. Sun Yuejun, Zheng Xiaopo, Qin Qiming, et al. Modeling soil spectral reflectance with different mass moisture content[J]. Spectroscopy and Spectral Analysis, 2015(8): 2236-2240. (in Chinese with English abstract)
[4] Yin Z, Lei T, Yan Q, et al. A near-infrared reflectance sensor for soil surface moisture measurement[J]. Computers & Electronics in Agriculture, 2013, 99(99): 101-107.
[5] Blanco M, Coello J, Iturriaga H, et al. NIR calibration in non-linear systems: Different PLS approaches and artificial neural networks[J]. Chemometrics & Intelligent Laboratory Systems, 2000, 50(1): 75-82.
[6] 张锐,李兆富,潘剑君. 小波包-局部最相关算法提高土壤有机碳含量高光谱预测精度[J]. 农业工程学报,2017,33(1):175-181. Zhang Rui, Li Zhaofu, Pan Jianjun. Coupling discrete wavelet packet transformation and local correlation maximization improving prediction accuracy of soil organic carbon based on hyperspectral reflectance[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(1): 175-181. (in Chinese with English abstract)
[7] Kaewpijit S, Moigne J L, El-Ghazawi T. Automatic reduction of hyperspectral imagery using wavelet spectral analysis[J]. IEEE Transactions on Geoscience & Remote Sensing, 2003, 41(4): 863-871.
[8] 李瑞平,史海滨,张晓红,等. 基于小波变换的最大冻深期气温与土壤水盐特征分析[J]. 农业工程学报,2012,28(6):82-87. Li Ruiping, Shi Haibin, Zhang Xiaohong, et al. Characteristic analysis of temperature, soil water and salt during maximum freezing depth period based on wavelet transform[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(6): 82-87. (in Chinese with English abstract)
[9] 廖钦洪,顾晓鹤,李存军,等. 基于连续小波变换的潮土有机质含量高光谱估算[J]. 农业工程学报,2012,28(23):132-139. Liao Qinhong, Gu Xiaohe, Li Cunjun, et al. Estimation of fluvo-aquic soil organic matter content from hyperspectral reflectance based on continuous wavelet transformation[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(23): 132-139. (in Chinese with English abstract)
[10] Zheng L H, Li M Z, Pan L, et al. Application of wavelet packet analysis in estimating soil parameters based on NIR spectra[J]. Spectroscopy & Spectral Analysis, 2009, 29(6): 1549-1552.
[11] 詹白勺,倪君辉,李军. 高光谱技术结合CARS算法的库尔勒香梨可溶性固形物定量测定[J]. 光谱学与光谱分析,2014(10):2752-2757. Zhan Baishao, Ni Junhui, Li Jun. Hyperspectral technology combined with CARS algorithm to quantitatively determine the SSC in Korla Fragrant Pear[J]. Spectroscopy and Spectral Analysis, 2014(10): 2752-2757. (in Chinese with English abstract)
[12] 于雷,朱亚星,洪永胜,等. 高光谱技术结合CARS算法预测土壤水分含量[J]. 农业工程学报,2016,32(22):138-145. Yu Lei, Zhu Yaxing, Hong Yongsheng, et al. Determination of soil moisture content by hyperspectral technology with CARS algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(22): 138-145. (in Chinese with English abstract)
[13] 温珍才,孙通,许朋,等. 可见/近红外联合变量优选检测油茶籽油掺假[J]. 江苏大学学报:自然科学版,2015,36(6):673-678. Wen Zhencai, Sun Tong, Xu Peng, et al. Adulteration detection of camellia oils by Vis/NIR spectroscopy and variable selection method[J]. Journal of Jiangsu University: Natural Science Edition, 2015, 36(6): 673-678. (in Chinese with English abstract)
[14] 于霜,刘国海,梅从立,等. 基于变量筛选的固态发酵pH值近红外检测[J]. 计算机与应用化学,2014,31(9):1143-1146. Yu Shuang, Liu Guohai, Mei Congli, et al. NIRS detection of pH value in solid-state fermentation process based on variable selection method of CARS[J]. Computers and Applied Chemistry, 2014, 31(9): 1143-1146. (in Chinese with English abstract)
[15] 孙通,许文丽,林金龙,等. 可见/近红外漫透射光谱结合CARS 变量优选预测脐橙可溶性固形物[J]. 光谱学与光谱分析,2012,32(12):3229-3233. Sun Tong, Xu Wenli, Lin Jinlong, et al. Determination of soluble solids content in navel oranges by Vis/NIR diffuse transmission spectra combined with CARS method[J]. Spectroscopy and Spectral Analysis, 2012, 32(12): 3229-3233. (in Chinese with English abstract)
[16] 张华秀,李晓宁,范伟,等. 近红外光谱结合CARS变量筛选方法用于液态奶中蛋白质与脂肪含量的测定[J]. 分析测试学报,2010,29(5):430-434. Zhang Huaxiu, Li Xiaoning, Fan Wei, et al. Determination of protein and fat in liquid milk by NIR combined with CARS variables screening method[J]. Journal of Instrumental Analysis, 2010, 29(5): 430-434. (in Chinese with English abstract)
[17] Lin L, Wang Y, Teng J, et al. Hyperspectral analysis of soil organic matter in coal mining regions using wavelets, correlations, and partial least squares regression[J]. Environmental Monitoring & Assessment, 2016, 188(2): 1-11.
[18] Qian, Shie. Introduction to time-frequency and wavelet transforms[M]. Beijing: China Machine Press, 2005.
[19] 刘燕德,欧阳爱国,应义斌. 小波分析用于光谱信号处理及其在Matlab中的实现[J]. 传感技术学报,2006,19(3):821-823. Liu Yande, Ouyang Aiguo, Ying Yibin. Application of wavelet analysis in signal process using matlab[J]. Chinese Journal of Sensors and Actuators, 2006, 19(3): 821-823. (in Chinese with English abstract)
[20] Xu Changfa, Cai Chao, Pi Minghong, et al. Correlation wavelet and its applications[J]. Chinese Quarterly Journal of Mathematics, 1999, 14(1): 5-9.
[21] Kaewpijit S, Moigne J L, Elghazawi T. Spectral data reduction via wavelet decomposition[C]//Aerosense. 2002: 56-63.
[22] Li Hongdong, Liang Yizeng, Xu Qingsong, et al. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration[J]. Analytica Chimica Acta, 2009, 648(1): 77-84.
[23] 于雷,洪永胜,耿雷,等. 基于偏最小二乘回归的土壤有机质含量高光谱估算[J]. 农业工程学报,2015,31(14):103-109. Yu Lei, Hong Yongsheng, Geng Lei, et al. Hyperspectral estimation of soil organic matter content based on partial least squares regression[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(14): 103-109. (in Chinese with English abstract)
[24] 薛利红,周鼎浩,李颖,等. 不同利用方式下土壤有机质和全磷的可见近红外高光谱反演[J]. 土壤学报,2014,51(5):993-1002. Xue Lihong, Zhou Dinghao, Li Ying, et al. Prediction of soil organic matter and total phosphorus with Vis-NIR hyperspectral inversion relative to land use[J]. Acta Pedologica Sinica, 2014, 51(5): 993-1002. (in Chinese with English abstract)
[25] Shi Zhou, Wang Qianlong, Peng Jie, et al. Development of a national VNIR soil-spectral library for soil classification and prediction of organic matter concentrations[J]. Science China Earth Sciences, 2014, 57(7): 1671-1680.
[26] Viscarra Rossel R A, Behrens T. Using data mining to model and interpret soil diffuse reflectance spectra[J]. Geoderma, 2010, 158(1/2): 46-54.
[27] 于雷,洪永胜,周勇,等. 高光谱估算土壤有机质含量的波长变量筛选方法[J]. 农业工程学报,2016,32(13):95-102. Yu Lei, Hong Yongsheng, Zhou Yong, et al. Wavelength variable selection methods for estimation of soil organic matter content using hyperspectral technique[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(13): 95-102. (in Chinese with English abstract)
[28] 陈至坤,张菡洁,王玉田,等. 基于小波变换的矿物油荧光光谱数据处理方法[J]. 激光杂志,2016(10):78-81. Chen Zhikun, Zhang Hanjie, Wang Yutian, et al. Fluorescence spectral date of mineral oil processing based on wavelet transform[J]. Laser Journal, 2016(10): 78-81. (in Chinese with English abstract)
[29] 王延仓,杨贵军,朱金山,等. 基于小波变换与偏最小二乘耦合模型估测北方潮土有机质含量[J]. 光谱学与光谱分析,2014(7):1922-1926. Wang Yancang, Yang Guijun, Zhu Jinshan, et al. Estimation of organic matter content of north fluvo-aquic soil based on the coupling model of wavelet transform and partial least squares[J]. Spectroscopy and Spectral Analysis, 2014(7): 1922-1926. (in Chinese with English abstract)
[30] Hymer D C, Moran M S, Keefer T O. Soil water evaluation using a hydrologic model and calibrated sensor network[J]. Soil Science Society of America Journal, 2000, 64(1): 319-326.
[31] 杨爱霞,丁建丽. 新疆艾比湖湿地土壤有机碳含量的光谱测定方法对比[J]. 农业工程学报,2015,31(18):162-168. Yang Aixia, Ding Jianli. Comparative assessment of two methods for estimation of soil organic carbon content by Vis-NIR spectra[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(18): 162-168. (in Chinese with English abstract)
[32] 李江波,彭彦昆,陈立平,等. 近红外高光谱图像结合CARS算法对鸭梨SSC含量定量测定[J]. 光谱学与光谱分析,2014(5):1264-1269. Li Jiangbo, Peng Yankun, Chen Liping, et al. Near-infrared hyperspectral imaging combined with CARS algorithm to quantitatively determine soluble solids content in “Ya” pear[J]. Spectroscopy and Spectral Analysis, 2014(5): 1264-1269. (in Chinese with English abstract)
Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance
Cai Lianghong, Ding Jianli※
(1. College of Resources & Environmental Science, Xinjiang University, Urumqi 830046, China; 2. Key Laboratory of Oasis Ecology, Ministry of Education, Xinjiang University, Urumqi 830046, China)
The rapid estimation of soil moisture content (SMC) is of great significance to precision agriculture in arid areas. Hyperspectral remote sensing has been widely used to estimate SMC because it is non-destructive, rapid and has high spectral resolution. However, many factors, such as the sheer volume of spectral data and surface conditions, can affect the spectra, making it harder to extract effective information and reducing the prediction accuracy of SMC. Noise reduction must be considered when developing hyperspectral estimation models, but how to reduce noise while retaining as much useful information as possible needs investigation. In this study, competitive adaptive reweighted sampling (CARS), an advanced spectral mining method, was used to address this problem. A total of 39 soil samples at 0-20 cm depth were collected from the delta oasis in Xinjiang. The samples were brought back to the laboratory, air-dried, ground and passed through a 2 mm sieve, and then filled into black boxes with 12 cm diameter and 1.8 cm depth, which were leveled at the rim with a spatula. The reflectance of the soil samples was measured with an ASD (Analytical Spectral Devices) FieldSpec 3 spectrometer in a dark room. The soil reflectance was processed in the following steps. First, the discrete wavelet transform (DWT) was used to decompose the original spectra into 8 levels with the db4 wavelet basis in MATLAB. To select the maximum level of the DWT, correlation coefficients between SMC and the spectra of each level were computed. Second, CARS was used to filter out redundant variables; the wavelength variables better correlated with SMC were selected, and the characteristic wavelength variables of each decomposition level were merged into the optimal variable set. 
Third, partial least squares regression (PLSR) was employed to build the hyperspectral estimation models of SMC. The root mean square error of the calibration set (RMSEC), the determination coefficient of the calibration set ($R_c^2$), the root mean square error of the prediction set (RMSEP), the determination coefficient of the prediction set ($R_p^2$) and the residual predictive deviation (RPD) were used for accuracy assessment. The results showed that: 1) As the number of decomposition levels increased, the correlation between soil reflectance and SMC first increased and then decreased, and the number of bands significant at the 0.01 level was largest at L6. In general, the characteristic spectrum of L6 was denoised while its spectral detail was preserved to the maximum extent, so the maximum decomposition level of the wavelet was 6. 2) The characteristic wavelength variables of each characteristic spectrum were selected by coupling the wavelet transform with the CARS algorithm. If only the CARS selection result of a single characteristic spectrum were taken into account, the water features of the other characteristic spectra would easily be ignored. Therefore, in this study, the characteristic wavelength variables of all levels were merged into the optimal variable set, which contained 131 wavelength variables near the absorption bands (450, 1 400, 1 900, 2 200 nm). 3) Compared with the full-band PLSR model, the PLSR models built on the CARS-selected variables of each decomposition level were more accurate, and the PLSR model of the optimal variable set had the highest accuracy and the best performance in predicting SMC in the study area (RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924, RPD=2.607). 
The combination of the wavelet transform and the CARS algorithm therefore loses as little spectral detail as possible and removes noise fairly thoroughly when the model is built, while also removing uninformative variables effectively, and it provides a new approach to selecting SMC spectral variables in this region.
soil; moisture content; spectrum analysis; WT; CARS; variable selection
10.11975/j.issn.1002-6819.2017.16.019
S127
A
1002-6819(2017)-16-0144-08
蔡亮红,丁建丽. 小波变换耦合CARS算法提高土壤水分含量高光谱反演精度[J]. 农业工程学报,2017,33(16):144-151.
10.11975/j.issn.1002-6819.2017.16.019 http://www.tcsae.org
Cai Lianghong, Ding Jianli. Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(16): 144-151. (in Chinese with English abstract)
2017-05-12
2017-07-10
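National Natural Science Foundation of China (U1303381, 41261090); Special Fund for Key Laboratories of the Autonomous Region (2016D03001); Autonomous Region Science and Technology Support for Xinjiang Project (201591101)
Cai Lianghong, male (Han), from Zunyi, Guizhou; his research focuses on remote sensing applications in arid regions. College of Resources and Environmental Science, Xinjiang University, Urumqi 830046. Email: 1173716776@qq.com
※Corresponding author: Ding Jianli, male (Han), from Urumqi, Xinjiang; PhD, doctoral supervisor; his research focuses on remote sensing of the eco-environment in arid regions. College of Resources and Environmental Science, Xinjiang University, Urumqi 830046. Email: 2187736938@qq.com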