Cai Jianhua, Hu Weiwen, Wang Xianchun (Information Institute, Hunan University of Arts and Science, Changde 415000, China)
Near-infrared spectrum detection of fish oil eicosapentaenoic acid content based on combinational filtering
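Abstract: To improve the accuracy of determining the eicosapentaenoic acid (EPA) content of fish oil, this study applied a near-infrared spectral de-noising method that combines empirical mode decomposition (EMD) with mathematical morphology filtering to the pretreatment of first-derivative spectra of fish oil. The principle and steps of the method are given and its de-noising performance is evaluated. A prediction model for the near-infrared spectra of fish oil EPA was built with partial least squares regression (PLSR), the EPA content of fish oil was calculated from the processed spectra, and the results were compared with those of nine-point smoothing and wavelet transform methods. The results show that, compared with conventional nine-point smoothing, the signal-to-noise ratio (SNR) rose from about 14 dB to about 35 dB and the root mean square error (RMSE) between the raw and de-noised signals fell from 0.005 71 to 0.002 26; the determination coefficient of the prediction set rose from 0.959 3 to 0.987 9 and the prediction RMSE fell from 0.060 1 to 0.031 2. This demonstrates the reliability of the combined EMD and mathematical morphology filtering method in spectral processing and shows that it improves the accuracy of quantitative near-infrared analysis of fish oil EPA content.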
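Keywords: spectrometry; models; empirical mode decomposition; mathematical morphology filtering; near-infrared spectrum; fish oil; de-noising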
Cai Jianhua, Hu Weiwen, Wang Xianchun. Near-infrared spectrum detection of fish oil eicosapentaenoic acid content based on combinational filtering[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(01): 312-317. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2016.01.043 http://www.tcsae.org
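Dyerberg et al. found that the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) in fish oil have antithrombotic and anti-atherosclerotic medicinal effects, and fish oil pharmaceuticals and foods have attracted wide interest ever since[1-2]. In recent years, near-infrared spectroscopy has become an important tool for the qualitative and quantitative analysis of EPA and DHA in fish oil[2]. Under background noise, however, some peaks of the fish oil absorption spectrum are often buried in noise and hard to identify, while the useful information mostly takes the form of characteristic peaks or correlated peaks[2-3]. To bring out the peak information, the spectral data therefore need to be de-noised to raise their signal-to-noise ratio. Commonly used smoothing filters have the drawback of losing high-frequency features of the spectral signal[4-5]. The wavelet transform and empirical mode decomposition (EMD) have been widely applied to spectral de-noising in recent years with good results, but shortcomings remain, such as the choice of wavelet function and the number of decomposition levels in wavelet methods[4-7] and mode mixing in EMD[8]. Combining the respective strengths of EMD and mathematical morphology filtering, this paper proposes a spectral pretreatment method and, for the first time, introduces it into near-infrared detection of fish oil EPA content, exploring a new way to improve the near-infrared determination of EPA in fish oil.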
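1.1 Empirical mode decomposition and basic morphological transforms
Empirical mode decomposition, proposed by the Chinese-American scientist Norden E. Huang and also known as the Huang transform, decomposes a signal into a sum of intrinsic mode functions (IMFs). It is an adaptive process that proceeds from high frequency to low frequency; the implementation steps are given in [9-10,12]. Mathematical morphology consists of a set of morphological operators, whose basic operations are erosion, dilation, opening and closing. Noise is usually superimposed on a signal as a peak (a "crest" or a "trough") within a limited range. Morphological opening strips away crest noise and morphological closing fills in trough noise, so both act as filters[13-15]. In practice, opening and closing are often combined to form morphological filtering algorithms. The generalized morphological filter used in this paper is defined on the basis of generalized open-closing and close-opening operations; the derivation is given in [14,16] and is not repeated here.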
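The opening and closing operations described above can be illustrated with a minimal NumPy sketch. It assumes flat (constant) structuring elements of odd width and assumes the commonly used form of the generalized filter, namely the average of an open-closing and a close-opening that use two different widths. The function names and the widths 3 and 5 are illustrative choices, not values taken from this paper.

```python
import numpy as np

def _sliding(signal, width, reducer):
    """Apply reducer (np.min or np.max) over a centered window of odd width."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    padded = np.pad(signal, half, mode="edge")  # repeat the edge values
    shifted = np.stack([padded[k:k + signal.size] for k in range(width)])
    return reducer(shifted, axis=0)

def erode(signal, width):
    return _sliding(signal, width, np.min)

def dilate(signal, width):
    return _sliding(signal, width, np.max)

def opening(signal, width):
    # Erosion followed by dilation: removes crests narrower than the window.
    return dilate(erode(signal, width), width)

def closing(signal, width):
    # Dilation followed by erosion: fills troughs narrower than the window.
    return erode(dilate(signal, width), width)

def generalized_filter(signal, width1=3, width2=5):
    # Assumed form: mean of open-closing and close-opening with two widths.
    open_close = closing(opening(signal, width1), width2)
    close_open = opening(closing(signal, width1), width2)
    return 0.5 * (open_close + close_open)

# Flat baseline with one crest and one trough, each a single sample wide.
signal = np.full(40, 0.2)
signal[10] += 1.0
signal[25] -= 1.0
filtered = generalized_filter(signal)
```

In this toy case both impulses are narrower than the structuring elements, so the filter returns the flat baseline. On a real derivative spectrum the widths would have to be chosen relative to the narrowest absorption feature that must be kept.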
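1.2 Pretreatment procedure
The pretreatment procedure for derivative spectra based on EMD and mathematical morphology filtering is shown in Fig.1. Following Fig.1, the method and its steps are as follows:
1) Using EMD, the derivative spectrum is decomposed into a series of intrinsic mode functions (IMFs), comprising high-order modes h(n) and low-order modes l(n). The high-frequency and low-frequency parts are then separated and de-noised individually.
2) The low-order modes l(n) are first processed by mathematical morphology filtering to obtain the de-noised part g(n); subtracting g(n) from l(n) gives the peak signal f(n). To retain as much of the useful spectral content as possible, f(n) is then separated a second time by adaptive threshold de-noising, giving f′(n). Finally, g(n) and f′(n) are added, and their sum is the de-noised low-order result l′(n).
3) The high-order modes h(n) are smoothed to remove baseline drift, giving the filtered result h′(n). The de-noised results l′(n) and h′(n) from steps 2 and 3 are added, and their sum is taken as the de-noised version of the original derivative spectrum.
4) The de-noised spectral data are related to reference data for EPA and other chemical components of the fish oil. A model is built in the quantitative analysis software Unscrambler using partial least squares with cross-validation, and the EPA content of the fish oil is analyzed.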
Fig.1 Flow chart of near-infrared spectrum pretreatment based on EMD and mathematical morphology filtering
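Step 2 of the procedure (morphological filtering of l(n), extraction of the peak signal f(n), thresholding, and recombination) can be sketched as follows. The text does not specify the adaptive threshold, so this sketch assumes a soft threshold at the universal level sigma*sqrt(2 ln N), with sigma estimated from the median absolute deviation. The window width of 5 and all function names are likewise assumptions made for illustration.

```python
import numpy as np

def _sliding(x, width, reducer):
    # Running min/max over a centered window of odd width, edges repeated.
    half = width // 2
    padded = np.pad(x, half, mode="edge")
    return reducer(np.stack([padded[k:k + x.size] for k in range(width)]), axis=0)

def opening(x, width):
    return _sliding(_sliding(x, width, np.min), width, np.max)

def closing(x, width):
    return _sliding(_sliding(x, width, np.max), width, np.min)

def morph_smooth(x, width=5):
    # Average of open-closing and close-opening (assumed filter form).
    return 0.5 * (closing(opening(x, width), width)
                  + opening(closing(x, width), width))

def soft_threshold(f):
    # Assumed threshold rule: universal threshold with a MAD noise estimate.
    sigma = np.median(np.abs(f - np.median(f))) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(f.size))
    return np.sign(f) * np.maximum(np.abs(f) - threshold, 0.0)

def denoise_low_order(low_order):
    g = morph_smooth(low_order)      # g(n): morphologically filtered part
    f = low_order - g                # f(n): peak signal removed by the filter
    f_prime = soft_threshold(f)      # f'(n): peaks kept after thresholding
    return g + f_prime               # l'(n) = g(n) + f'(n)

# A flat low-order component carrying one isolated sharp peak.
low_order = np.full(80, 0.3)
low_order[30] += 1.0
low_order_denoised = denoise_low_order(low_order)
```

In this noise-free toy case the morphological filter removes the peak, the residual f(n) contains only that peak, and the threshold evaluates to zero, so the peak is restored in full. This shows the purpose of the second separation: sharp features that the morphological filter would otherwise discard are added back.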
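1.3 Spectral data acquisition
The portable near-infrared spectrometer used was a Mini-AOTF/(NIR), model Luminar 5030, made by Brimrose (USA). The instrument covers 1 300~2 300 nm with a wavelength increment of 2 nm and 600 scans. A total of 48 samples were labeled, of which 28 were randomly assigned to the calibration set and 20 to the validation set; the spectra of the 48 fish oil samples were collected in diffuse reflectance mode. The filtering methods, partial least squares regression (PLSR) and statistical analysis were implemented in Unscrambler 9.8 and Matlab 7.0.1. The raw spectra and first-derivative spectra of the 48 fish oil samples are shown in Fig.2. The raw spectra are fairly smooth but show considerable baseline drift; the derivative spectra remove the baseline drift, but differentiation also amplifies random error and markedly lowers the signal-to-noise ratio, so de-noising is needed.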
Fig.2 Raw spectra and first-derivative spectra of the 48 fish oil samples
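The trade-off noted in Section 1.3, that differentiation removes baseline offset but amplifies random error, can be checked numerically. The sketch below uses the instrument's wavelength grid (1 300~2 300 nm in 2 nm steps) with a synthetic Gaussian band and synthetic white noise. The band position, band width and noise level are arbitrary assumptions, and `np.gradient` stands in for whatever derivative routine was actually used.

```python
import numpy as np

# Wavelength axis matching the stated range and 2 nm increment.
wavelengths = np.arange(1300.0, 2302.0, 2.0)

# Synthetic absorption band and white noise (illustrative values only).
band = np.exp(-0.5 * ((wavelengths - 1718.0) / 30.0) ** 2)
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.01, wavelengths.size)

# First derivative by finite differences.
d_band = np.gradient(band, wavelengths)
d_shifted = np.gradient(band + 0.5, wavelengths)  # same band, offset baseline
d_noise = np.gradient(noise, wavelengths)

# Noise level relative to the peak-to-peak signal amplitude.
rel_noise_raw = noise.std() / np.ptp(band)
rel_noise_derivative = d_noise.std() / np.ptp(d_band)
```

The derivative of the offset band equals that of the original band, so a constant baseline shift disappears. The relative noise level is several times higher after differentiation, which is why the derivative spectra need de-noising.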
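1.4 Evaluation parameters for the pretreatment
Spectral de-noising is generally required to leave the positions and shapes of spectral features unchanged while making the curve as smooth as possible[17]. To examine the de-noising performance of the proposed method on near-infrared spectra, four indices were used to evaluate the de-noising of fish oil spectra: signal-to-noise ratio (SNR), root mean square error (RMSE), horizontal feature remain index (HFRI) and vertical feature remain index (VFRI). SNR and RMSE reflect the de-noising ability of a method; HFRI and VFRI at the characteristic bands evaluate how well it preserves spectral features. The four indices are calculated according to Eqs. (1)-(4)[18-21].
Horizontal feature remain index:
Vertical feature remain index:
In Eqs. (1)-(4), f(mi) and its de-noised counterpart denote the spectral data before and after de-noising, i=1,2,…,N, where N is the number of bands and i is the band position; Eq. (3) is centered on the original feature position i.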
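For orientation, the two noise-related indices can be computed as below. This is a sketch under commonly used definitions, which may differ in detail from Eqs. (1) and (2) of this paper: SNR is taken as ten times the base-10 logarithm of the ratio of raw-signal energy to residual energy, and RMSE as the root of the mean squared difference between the raw and de-noised spectra. HFRI and VFRI are not sketched because their formulas are not given in the text.

```python
import numpy as np

def snr_db(raw, denoised):
    """SNR in dB, assuming the residual raw - denoised is the removed noise."""
    residual = raw - denoised
    return 10.0 * np.log10(np.sum(raw ** 2) / np.sum(residual ** 2))

def rmse(raw, denoised):
    """Root mean square error between raw and de-noised spectra."""
    return np.sqrt(np.mean((raw - denoised) ** 2))

# Small worked example with a constant residual of 0.1.
raw = np.array([1.0, 2.0, 3.0, 4.0])
denoised = raw + 0.1
example_rmse = rmse(raw, denoised)
example_snr = snr_db(raw, denoised)
```

Here the residual energy is 4 × 0.01 = 0.04 and the raw energy is 30, so the RMSE is 0.1 and the SNR is 10·log10(750), about 28.75 dB.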
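2.1 De-noising of fish oil spectra
The spectrum of sample No.8 is first used as an example to evaluate the effectiveness of the proposed method. Fig.3(a) shows the raw spectrum of sample No.8; its peak features are not distinct enough, and the raw spectrum usually needs to be differentiated to improve the accuracy of spectral analysis. Fig.3(b) shows the first-derivative spectrum of sample No.8. After differentiation the spectral features become distinct and the absorption peaks narrower, but the derivative spectrum also has a lower signal-to-noise ratio, so further de-noising is required.
Four methods were used to de-noise the spectrum: 9-point moving smoothing, 25-point moving smoothing, wavelet soft-threshold de-noising (Heursure threshold)[22-23] and the proposed method. The results of applying the four methods to the derivative spectrum of sample No.8 are shown in Fig.3(c)-Fig.3(f). After filtering, the four evaluation indices were calculated: first the SNR and RMSE of the de-noised spectrum, then the feature remain indices HFRI and VFRI at the selected characteristic bands. The clearly absorbing bands in Fig.3 were taken as characteristic bands: five segments centered at 1 396, 1 718, 1 765, 2 109 and 2 230 nm, each extending 5 nm on either side. Table 1 lists the four evaluation parameters for the de-noising methods.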
Fig.3 Comparison of several de-noising methods for the spectrum of sample No.8
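The two moving-smoothing baselines and the selection of the five characteristic bands can be sketched as follows. The moving average here is a plain centered boxcar with repeated edge values, which is an assumption about how the 9-point and 25-point smoothing were implemented. The narrow Gaussian test peak is synthetic.

```python
import numpy as np

def moving_average(x, window):
    """Centered boxcar smoothing with an odd window; output length equals input."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    padded = np.pad(x, window // 2, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

# Instrument grid and the five band centers given in the text (nm).
wavelengths = np.arange(1300, 2302, 2)
centers = np.array([1396, 1718, 1765, 2109, 2230])

# Boolean mask of grid points within 5 nm of any band center.
feature_mask = (np.abs(wavelengths[:, None] - centers[None, :]) <= 5).any(axis=1)

# Synthetic narrow peak to show how the window width erodes peak height.
peak = np.exp(-0.5 * ((wavelengths - 1718) / 6.0) ** 2)
smooth9 = moving_average(peak, 9)
smooth25 = moving_average(peak, 25)
```

On the 2 nm grid the mask selects 27 points in total (5 or 6 per band). The smoothed peak is lower with the 25-point window than with the 9-point window, which mirrors the loss of feature information that the paper reports for wider smoothing windows.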
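Fig.3 and Table 1 show the following. 1) Moving smoothing removes high-frequency noise effectively and raises the SNR of the derivative spectrum, but it also loses much useful information, so horizontal and vertical feature retention are poor; neither its smoothing nor its feature preservation is good, although the spectrum becomes smoother as the window widens. 2) Wavelet soft-threshold de-noising has little effect on peak shape; of the two de-noising parameters, its SNR reaches 29 dB and its RMSE is only about 0.003, and features are preserved relatively well, making it a good filtering method. 3) The morphological wavelet de-noises well and largely removes the noise in the derivative spectrum. 4) The proposed method based on EMD and mathematical morphology filtering de-noises better than the other methods. Its RMSE is only 0.002 26 and its SNR reaches 35.785 dB; it also preserves the sharp characteristic peaks of the spectral signal, and both its HFRI and VFRI are better than those of the other methods, reflecting good detail retention and noise resistance.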
Table 1 Comparison of SNR, RMSE, HFRI and VFRI for several de-noising methods applied to the spectrum of sample No.8
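The spectra of the other 47 samples were de-noised in the same way. Fig.4 compares the de-noising of the 48 fish oil spectra by the wavelet transform and by the proposed method (because wavelet thresholding de-noises better than smoothing and is commonly used in spectral pretreatment, only the wavelet soft-threshold method is compared with the proposed method here to save space). As Fig.4 shows, wavelet threshold de-noising performs well and has little effect on peak shape; the method based on EMD and mathematical morphology filtering performs somewhat better (compare the spectral curves at the beginning and end of the range), suppressing noise well while leaving the peak shapes largely unchanged and the sharp characteristic peaks well preserved.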
Fig.4 Comparison of de-noising results for the first-derivative near-infrared spectra of the 48 fish oil samples
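2.2 EPA content determination
From the 48 fish oil samples collected, 20 were randomly selected, and their spectra were de-noised by 9-point smoothing, wavelet soft thresholding, the morphological wavelet and the proposed method. The pretreated spectral data were related to reference data for EPA and other chemical components of the fish oil, and the EPA content was analyzed in order to compare the effect of the different filtering methods on EPA determination. The determination coefficient (r2) and root mean square error (RMSE) of the prediction set were used to assess the de-noising methods. The comparison shows that the method based on EMD and mathematical morphology filtering is effective: compared with the commonly used 9-point smoothing, the prediction RMSE fell from 0.060 1 to 0.031 2 and the determination coefficient r2 of the prediction set rose from 0.959 3 to 0.987 9. Its results were also better than those of the wavelet soft-threshold and morphological wavelet methods, effectively improving the accuracy of spectral analysis.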
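The modeling and evaluation step can be illustrated with a small PLS1 (single-response partial least squares) regression written in NumPy. The authors built their model in Unscrambler; the code below is only a sketch of the technique. It uses synthetic spectra made of two hypothetical Gaussian bands, a 28/20 calibration/validation split as in Section 1.3, and r2 defined as one minus the ratio of residual to total sum of squares. All data, names and the choice of two latent variables are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 by NIPALS deflation; returns regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        qa = (yk @ t) / tt            # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - qa * t              # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

# Synthetic spectra: analyte band + interfering band + small noise.
rng = np.random.default_rng(0)
wavelengths = np.arange(1300, 2302, 2)
band_a = np.exp(-0.5 * ((wavelengths - 1718) / 40.0) ** 2)  # hypothetical analyte
band_b = np.exp(-0.5 * ((wavelengths - 2109) / 60.0) ** 2)  # hypothetical interferent
conc = rng.uniform(0.0, 1.0, 48)
other = rng.uniform(0.0, 1.0, 48)
X = (np.outer(conc, band_a) + np.outer(other, band_b)
     + rng.normal(0.0, 1e-3, (48, wavelengths.size)))

# Random 28/20 split into calibration and validation sets.
order = rng.permutation(48)
cal, val = order[:28], order[28:]
coef, intercept = pls1_fit(X[cal], conc[cal], n_components=2)
pred = X[val] @ coef + intercept

rmsep = np.sqrt(np.mean((pred - conc[val]) ** 2))
r2 = 1.0 - np.sum((conc[val] - pred) ** 2) / np.sum((conc[val] - conc[val].mean()) ** 2)
```

Because the synthetic spectra are nearly noise-free and contain only two sources of variation, two latent variables fit them almost exactly. With real spectra the number of latent variables would normally be chosen by cross-validation, as the paper does.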
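1) This paper combines empirical mode decomposition with mathematical morphology filtering and applies the combination to the pretreatment of fish oil near-infrared spectra. Comparison and evaluation parameters show that the method unites the strengths of both and is practical and effective.
2) The experiments show that, while faithfully preserving the details of the fish oil spectral signal, the proposed method strongly attenuates noise and alleviates the problem that weak fish oil absorption signals leave some peaks buried in noise and hard to identify. It effectively improves spectral quality and supports subsequent spectral modeling for fish oil component analysis.
3) Compared with conventional 9-point smoothing, the proposed method lowers the prediction RMSE from 0.060 1 to 0.031 2 and raises the determination coefficient r2 of the prediction set from 0.959 3 to 0.987 9. Its results are also better than those of the wavelet soft-threshold and morphological wavelet methods, improving the prediction accuracy of the model.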
[References]
[1] Hans B P. Analysis of water in food by near infrared spectroscopy[J]. Food Chemistry, 2003, 82(1): 107-110.
[2] Cai Jianhua. Near-infrared spectrum detection of fish oil DHA content based on empirical mode decomposition and independent component analysis[J]. Journal of Food and Nutrition Research, 2014, 2(2): 62-68.
[3] Hao Yong, Chen Bin, Zhu Rui. Analysis of several methods for wavelet de-noising used in near infrared spectrum pretreatment[J]. Spectroscopy and Spectral Analysis, 2006, 26(10): 1838-1842. (in Chinese with English abstract)
[4] Cai Jianhua, Wang Xianchun, Hu Weiwen. Near-infrared spectrum detection of tobacco nicotine content based on morphological wavelet[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(15): 281-286. (in Chinese with English abstract)
[5] Zhao Jiewen, Zhang Haidong, Liu Muhua. Preprocessing methods of near-infrared spectra for simplifying prediction model of sugar content of apples[J]. Acta Optica Sinica, 2006, 26(1): 136-140. (in Chinese with English abstract)
[6] Ma Jing, Tan Liying, Ran Qiwen. A preliminary study of optical wavelet filtering theory[J]. Chinese Journal of Lasers, 1999, A26(4): 343-347. (in Chinese)
[7] Li Suwen, Xie Pinhua, Li Yujin, et al. Wavelet transform based differential optical absorption spectroscopy data processing[J]. Acta Optica Sinica, 2006, 26(11): 1601-1604. (in Chinese with English abstract)
[8] Cai Jianhua, Wang Xianchun, Hu Weiwen. Near-infrared spectrum detection of soil organic matter content based on empirical mode decomposition[J]. Transactions of the Chinese Society for Agricultural Machinery, 2010, 41(9): 182-186. (in Chinese with English abstract)
[9] Huang N E, Shen Z, Long S R, et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis[J]. Proceedings of the Royal Society of London Series A, 1998, 454: 903-998.
[10] Huang N E, Wu M C, Long S R, et al. A confidence limit for the empirical mode decomposition and the Hilbert spectral analysis[J]. Proceedings of the Royal Society of London Series A, 2003, 459: 2317-2362.
[11] Peng Z K, Tse P W, Chu F L. An improved Hilbert-Huang transform and its application in vibration signal analysis[J]. Journal of Sound and Vibration, 2005, 286: 187-205.
[12] Rato R T, Ortigueira M D, Batista A G. On the HHT, its problems, and some solutions[J]. Mechanical Systems and Signal Processing, 2008, 22: 1374-1394.
[13] Sun Y, Chan K L, Krishnan S M. ECG signal conditioning by morphological filtering[J]. Computers in Biology and Medicine, 2002, 32(6): 465-479.
[14] Maragos P, Schafer R W. A unification of linear, median, order statistics and morphological filters under mathematical morphology[J]. Acoustics, Speech and Signal Processing, IEEE International Conference on ICASSP'85, 1985, 10: 1329-1332.
[15] Song J S, Delp E J. A study of the generalized morphological filter[J]. Circuits, Systems and Signal Processing, 1992, 11: 229-252.
[16] Sun Y, Chan K L, Krishnan S M. ECG signal conditioning by morphological filtering[J]. Computers in Biology and Medicine, 2002, 32: 465-479.
[17] Liu Jie, Li Xiaoyu, Li Peiwu, et al. Determination of moisture in chestnuts using near infrared spectroscopy[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2010, 26(2): 338-341. (in Chinese with English abstract)
[18] Huang Mingxiang, Wang Ke, Shi Zhou, et al. Quantitative evaluation of soil hyperspectra denoising with different filters[J]. Spectroscopy and Spectral Analysis, 2009, 29(3): 722-725. (in Chinese with English abstract)
[19] Battista B, Knapp C, McGee T, et al. Application of the empirical mode decomposition and Hilbert-Huang transform to seismic reflection data[J]. Geophysics, 2007, 72: H29-H37.
[20] Cai Jianhua, Wang Xianchun, Hu Weiwen. Near-infrared spectrum detection of tobacco nicotine content based on morphological wavelet[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(15): 281-286. (in Chinese with English abstract)
[21] Donoho D L. De-noising by soft-thresholding[J]. IEEE Transactions on Information Theory, 1995, 41: 613-627.
[22] Trad D O, Travassos J M. Wavelet filtering of magnetotelluric data[J]. Geophysics, 2000, 65(2): 482-491.
Near-infrared spectrum detection of fish oil eicosapentaenoic acid content based on combinational filtering
Cai Jianhua, Hu Weiwen, Wang Xianchun
(Information Institute, Hunan University of Arts and Science, Changde, 415000, China)
Abstract: Near-infrared (NIR) spectral analysis has become an important method for the qualitative and quantitative analysis of the composition of fish oil. However, the absorption spectrum signal of fish oil is generally weak, and when the NIR spectrum is used for component analysis, some spectral peaks are often submerged in noise and difficult to identify. To improve the accuracy of non-destructive detection of the eicosapentaenoic acid (EPA) content of fish oil, a combined method based on empirical mode decomposition (EMD) and morphological filtering was proposed for the pretreatment of fish oil NIR spectra. The principle and steps of the method were given. First, the derivative spectra were decomposed by EMD into a series of modal functions, comprising high-order and low-order modal functions, and the two parts were separated and processed individually. For the low-order modal functions, mathematical morphology filtering and adaptive threshold de-noising were used so as to retain as much useful spectral data as possible. For the high-order modal functions, a smoothing filter was used to eliminate baseline drift. The sum of the two parts was taken as the de-noised spectrum. Finally, the de-noised spectral data were correlated with the EPA chemical composition data of the fish oil. Partial least squares regression was adopted to establish the prediction model, and the EPA content of fish oil was calculated from the de-noised spectrum. The spectra of 48 fish oil samples were collected using a portable NIR spectrometer (Mini-AOTF/(NIR), model Luminar 5030) produced by Brimrose in the United States of America. The wavelength range was 1 300~2 300 nm, the wavelength increment was 2 nm and the number of scans was 600. Twenty-eight fish oil samples were randomly selected as the calibration set and 20 as the validation set. Nine-point smoothing, wavelet soft thresholding, the morphological wavelet and the proposed method were each used as the pretreatment method. The EPA content of fish oil was then calculated from the de-noised spectra and the results were compared. The filtering methods and the statistical analysis were implemented in Matlab 7.0.1. The result of the proposed method was compared with that of nine-point smoothing, the most traditional method. The signal-to-noise ratio (SNR) improved from 14 to 35 dB, and the root mean square error (RMSE) between the raw and de-noised signals fell from 0.005 71 to 0.002 26, which indicates that the proposed method performs well in detail retention and noise resistance. The determination coefficient of the prediction set improved from 0.959 3 to 0.987 9 and the RMSE fell from 0.060 1 to 0.031 2, so the prediction accuracy of the model was improved. The results were also better than those of the wavelet soft-threshold and morphological wavelet methods, which are widely used in spectral preprocessing. The experimental results showed that the proposed method combines the advantages of EMD and mathematical morphology filtering. While the real details of the fish oil spectral signal were kept, the noise was strongly attenuated. After de-noising, spectral peaks that had been submerged in noise became clear and easy to identify, and the quality of the spectral data was effectively improved. These results show that the proposed combined method is effective for the pretreatment of fish oil NIR spectra and improves the accuracy of NIR detection of fish oil EPA content. The combination of EMD and morphological filtering also provides a new approach to NIR spectral de-noising.
Keywords: spectrometry; models; empirical mode decomposition; morphological filtering; near-infrared spectrum; fish oil; de-noising
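About the author: Cai Jianhua, male, born in Guiyang, Hunan; associate professor, PhD, mainly engaged in research on optoelectronic signal processing. Information Institute, Hunan University of Arts and Science, Changde 415000. Email: cjh1021cjh@163.com
Funding: National Natural Science Foundation of China (41304098); Youth Project of the Education Department of Hunan Province (13B076); Hunan Provincial Key Construction Discipline (Optics) Fund; Doctoral Start-up Project of Hunan University of Arts and Science.
Received: 2015-07-26
Revised: 2015-11-17
CLC number: O657.3
Document code: A
Article ID: 1002-6819(2016)-01-0312-06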
doi:10.11975/j.issn.1002-6819.2016.01.043