Influence of signal-to-noise ratio on accuracy of spectral analysis by near infrared spectroscopy

2020-08-25 04:50

ZHUANG Xin-gang, SHI Xue-shun, LIU Hong-bo, LIU Chang-ming, ZHANG Peng-ju, WANG Heng-fei

(1. The 41st Research Institute of China Electronics Technology Group Corporation, Qingdao 266555, China; 2. National Opto-Electronic Primary Metrology Laboratory, Qingdao 266555, China; 3. Science and Technology on Electronic Test & Measurement Laboratory, Qingdao 266555, China)

Abstract: As one of the key performance indicators of a spectrometer, the signal-to-noise ratio (SNR) reflects the ability of the instrument to detect weak signals. To investigate the influence of SNR on the prediction accuracy of spectral analysis, we first introduce the major factors affecting the spectral SNR. Taking green tea as an example, the influence of spectral SNR on the prediction accuracy of an origin identification model is then analyzed experimentally, and the relationship between spectral SNR and prediction accuracy is fitted. On this basis, common methods for improving the spectral SNR are discussed. The results show that, as the spectral SNR decreases, the accuracy on the prediction set first decreases slowly, then decreases linearly, and finally levels off. According to the fitted curve, prediction accuracies of 90% and 85% require the spectral SNR to be higher than 23.42 dB and 21.16 dB, respectively. The results provide parameter support for the development of new online analytical spectroscopic instruments, especially for specifying the SNR.

Key words: near infrared spectroscopy; signal-to-noise ratio (SNR); partial least squares (PLS); spectral analysis; green tea

0 Introduction

Over the past decades, near infrared (NIR) spectroscopy has developed considerably. It has become one of the most highly regarded analytical technologies and has been widely applied in many fields[1-4]. As an indirect analysis method, NIR spectroscopy relies on three components: the NIR instrument, chemometric software and the analytical model. The instrument is the basis of the technique and determines whether the analytical method can be put into use. In contrast to the large, high-precision optical instruments used in laboratories, miniaturized and portable spectrometers are becoming an important trend in practical applications. However, miniaturization and integration inevitably sacrifice some of the performance of the spectrometer. In addition, the complex scattering environment encountered in online detection inevitably degrades the quality of the spectra, especially the signal-to-noise ratio (SNR). As one of the evaluation parameters of a spectrometer, SNR has an important influence on the analysis accuracy of NIR spectroscopy. Hence, in developing spectroscopic instruments, a trade-off among structural parameters, price and analytical accuracy is required to reach an optimal spectral analysis scheme.

Only a few researchers have so far carried out preliminary studies of the influence of SNR on the accuracy of spectral analysis. For airborne and spaceborne remote sensing, Moses et al. elucidated the effect of SNR on the accuracy of retrieved constituents in coastal waters, such as total suspended solids (TSS), colored dissolved organic matter (CDOM) and chlorophyll-a. Their results show that improving the SNR by reasonably modifying the sensor design can reduce the estimation uncertainty by 10% or more[5]. Banas et al. of the National University of Singapore investigated the influence of SNR on the identification of high explosive substances by applying multivariate statistical methods to Fourier transform infrared spectral data sets[6]. Li et al. of Tianjin University analyzed the relationship between SNR and the precision of quantitative analysis[7]. However, most recent research focuses on large spectral instruments, and experimental studies of how SNR affects the quality identification accuracy of portable optical fiber spectrometers are lacking.

In this paper, the factors that influence the SNR of an optical fiber spectrometer are described on the basis of results obtained by our research group in recent years. The influence of spectral SNR on the accuracy of green tea origin identification is then analyzed experimentally. The results provide a reference for the development of new online micro-spectral instruments and for the application of spectral analysis.

1 Experimental

1.1 Noise model

SNR is an important indicator of the radiation sensitivity of a spectrometer and determines its ability to detect weak signals. Many factors affect the SNR of a spectrometer. In general, the SNR of a fiber-optic spectrometer based on a linear array detector can be expressed by[8]

(1)

where $P_f(\lambda)$ is the incoming radiance at the sensor at wavelength $\lambda$; $Q_e(\lambda)$ is the quantum efficiency of the photodetector at wavelength $\lambda$; $t$ is the integration time; $B_f(\lambda)$ is the stray light from background-reflected photons at wavelength $\lambda$, which is determined by the external environment and the spectrometer itself; $D$ is the dark current, which is the source of dark noise; and $N_r$ is the readout noise, which is related to the readout speed. Generally speaking, the higher the readout speed, the greater the readout noise. When the readout speed is constant, the readout noise can be regarded as a constant.
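With the symbols defined above, one form of the SNR that is commonly used for array detectors can be sketched as follows. This is an assumption rather than a quotation of the expression in [8]: it supposes that all quantities are expressed as photon or electron counts per pixel, and that the shot noise of the signal, stray light and dark current adds in quadrature with the readout noise.

```latex
\mathrm{SNR}(\lambda) =
  \frac{P_f(\lambda)\, Q_e(\lambda)\, t}
       {\sqrt{\left[ P_f(\lambda) + B_f(\lambda) \right] Q_e(\lambda)\, t + D\, t + N_r^{2}}}
```

Under this form the signal grows in proportion to $t$, while the time-dependent noise terms grow only as $\sqrt{t}$ and the readout term $N_r$ stays fixed.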

When the readout detector is determined, the factors affecting the SNR of spectral data acquisition system mainly come from three aspects.

1) The effective incoming radiance at the detector and stray light. The greater the ratio of incoming radiance to stray light ($P_f/B_f$), the higher the SNR.

2) The integration time. The SNR can be effectively improved by increasing the integration time, but the detection rate decreases. In addition, the integration time should be matched to the level of incoming radiance.

3) Readout noise. Readout noise is mainly determined by the circuit system.

Of these three aspects, the effective incoming radiance and stray light are the main factors. The effective incoming radiance is mainly related to the optical efficiency of the system, which depends on parameters of the optical system such as the relative aperture and the transmittance for the incident light. Stray light comes from many sources, which can be divided into background stray light and thermal-radiation stray light caused by the instrument itself. Background stray light mainly depends on the working environment of the instrument and is dominated by sunlight when working outdoors. In addition, reflection of the incident light inside the optical system is also an important source of stray light.
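The qualitative effect of the three factors can be explored with a small numerical sketch. The noise model below (shot noise of signal, stray light and dark current added in quadrature with readout noise) is an assumed, commonly used form and not necessarily the exact expression of Eq.(1). The function name `snr` and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def snr(pf, bf, qe, t, dark, read_noise):
    """SNR of one pixel under an assumed shot-noise model.

    pf, bf     : signal and stray-light photon rates (photons/s, hypothetical units)
    qe         : quantum efficiency (0..1)
    t          : integration time (s)
    dark       : dark current (electrons/s)
    read_noise : readout noise (electrons RMS)
    """
    signal = pf * qe * t
    noise = math.sqrt((pf + bf) * qe * t + dark * t + read_noise ** 2)
    return signal / noise

# Illustrative values only; they do not describe any real instrument.
base = snr(pf=1e5, bf=1e4, qe=0.8, t=0.01, dark=100, read_noise=10)
longer_integration = snr(pf=1e5, bf=1e4, qe=0.8, t=0.04, dark=100, read_noise=10)
less_stray_light = snr(pf=1e5, bf=1e3, qe=0.8, t=0.01, dark=100, read_noise=10)

print(f"base: {base:.1f}")
print(f"4x integration time: {longer_integration:.1f}")
print(f"10x less stray light: {less_stray_light:.1f}")
```

In this sketch both a longer integration time and a larger $P_f/B_f$ ratio raise the SNR, in line with items 1) and 2).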

1.2 Sample preparation and spectra collection

In our study, 220 authentic and representative green tea samples (110 Laoshan and 110 Rizhao) were collected from Laoshan and Rizhao, two main green tea producing areas in Shandong province. The 220 samples were randomly divided into a calibration set (77 Laoshan and 77 Rizhao samples) and a prediction set (33 Laoshan and 33 Rizhao samples). The prediction set therefore contained 33 positive (Laoshan green tea) objects and 33 negative (Rizhao green tea) objects.
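A random split that preserves the per-origin counts given above can be sketched as follows. Shuffling within each class is an assumption about how the random division was carried out; the function name `stratified_split` and the seed are hypothetical.

```python
import random

def stratified_split(labels, n_cal_per_class, seed=0):
    """Randomly assign n_cal_per_class samples of each class to calibration,
    and the remaining samples of that class to prediction."""
    rng = random.Random(seed)
    cal, pred = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        cal.extend(idx[:n_cal_per_class])
        pred.extend(idx[n_cal_per_class:])
    return sorted(cal), sorted(pred)

labels = ["Laoshan"] * 110 + ["Rizhao"] * 110
cal_idx, pred_idx = stratified_split(labels, n_cal_per_class=77)
print(len(cal_idx), len(pred_idx))
```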

The NIR spectra in the range of 1 050-2 500 nm were collected in reflectance mode using an AvaSpec-NIR256/2.5TEC spectrometer (Avantes, Netherlands). The spectrometer and the green tea spectra collection device are presented in Fig.1. Ten spectra were collected from different positions on each sample, and each spectrum was the average of 40 scans. The raw spectral data were recorded at 6.4 nm intervals, which resulted in 227 variables. For each sample, the mean of the 10 spectra was used in the subsequent analysis.

Fig.1 Spectrometer and green tea spectra collection device

The region from 1 300 nm to 2 300 nm was then selected for further analysis, since both ends of the spectra exhibited a high level of noise, as shown in Fig.2, where Laoshan and Rizhao green tea are colored in red and blue, respectively. For each sample, 30±0.1 g of tea leaves was placed in a 200 mL beaker and pressed to keep the surface flat, without any other sample pretreatment. The distance between the probe and the green tea was kept at 1 cm. All samples were stored in a cool, dry freezer before spectra collection. The room temperature was kept at 25 ℃, and the humidity was kept at the ambient level of the laboratory.

Fig.2 NIR spectra of Laoshan and Rizhao green tea

The SNR of the original spectra acquired by the spectrometer is 34.31 dB. During the experiments, Matlab was used to add random white noise with different root-mean-square (RMS) values to the original spectral data, and a new green tea spectral database with SNRs ranging from 4.77 dB to 34.31 dB was constructed. Fig.3 shows five green tea spectra with different SNRs. The spectral curves become progressively less smooth as the SNR decreases, so that more of the absorption information of the material is submerged in the noise.
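The noise-addition step can be sketched in Python as follows (the study itself used Matlab). The sketch assumes that SNR in dB is defined as 20·log10 of the ratio of signal RMS to noise RMS; the text does not state the definition used, so this is an assumption. The synthetic `clean` spectrum and all function names are hypothetical.

```python
import math
import random

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def add_white_noise(spectrum, target_snr_db, rng):
    """Add zero-mean Gaussian noise whose RMS gives the requested SNR (assumed definition)."""
    noise_rms = rms(spectrum) / 10 ** (target_snr_db / 20)
    return [v + rng.gauss(0.0, noise_rms) for v in spectrum]

def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean spectrum and its noisy copy."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 20 * math.log10(rms(clean) / rms(noise))

rng = random.Random(42)
# Synthetic stand-in for a reflectance spectrum; not real green tea data.
clean = [0.5 + 0.2 * math.sin(i / 50) for i in range(5000)]
noisy = add_white_noise(clean, target_snr_db=20.0, rng=rng)
print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

Repeating the call with a series of target values yields a graded set of noisy spectra analogous to the database described above.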

Fig.3 Comparison of green tea spectra with different SNRs

1.3 Software

AvaSoft (AvaSpecTEC system) was used for spectra collection. All algorithms were implemented in ARCO-NIR, a self-developed NIR analysis software package written in Matlab 2010a (Mathworks Co., USA) under Windows 7.

2 Results and discussion

2.1 Influence of spectral SNR

The origin of each sample was coded as a dummy label, with a value of 1 for Laoshan green tea and 2 for Rizhao green tea. After random white noise with different RMS values had been added to the original spectra, 16 regression models relating the NIR spectra to the green tea origins (dummy labels) were built by partial least squares (PLS). The quality of the identification models was assessed by sensitivity, specificity and prediction accuracy. The prediction results at different SNRs for the calibration set and the prediction set are shown in Figs.4(a) and (b), respectively.
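The three quality measures can be computed from the continuous PLS outputs as sketched below. The decision threshold of 1.5 (the midpoint between the two dummy labels) is an assumption, since the cut-off is not stated in the text; Laoshan (label 1) is treated as the positive class, as in Section 1.2. The example outputs are made-up numbers, and the function names are hypothetical.

```python
def to_labels(pls_outputs, threshold=1.5):
    """Map continuous PLS outputs to dummy labels: 1 (Laoshan) or 2 (Rizhao)."""
    return [1 if y < threshold else 2 for y in pls_outputs]

def identification_metrics(true_labels, predicted_labels, positive=1):
    """Return sensitivity, specificity and accuracy for a two-class problem."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),   # correctly identified positives
        "specificity": tn / (tn + fp),   # correctly identified negatives
        "accuracy": (tp + tn) / len(pairs),
    }

true_labels = [1, 1, 1, 2, 2, 2]
pls_outputs = [0.9, 1.2, 1.7, 1.8, 2.1, 1.4]   # made-up values for illustration
metrics = identification_metrics(true_labels, to_labels(pls_outputs))
print(metrics)
```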

As can be seen from Fig.4, there is an approximately linear relationship between SNR and the sensitivity, specificity and prediction accuracy for both the calibration set and the prediction set. The difference is that, for the calibration set, the sensitivity, specificity and prediction accuracy vary with SNR in essentially the same way, whereas for the prediction set the trends of sensitivity and specificity diverge once the SNR drops to 17 dB, with the sensitivity becoming significantly lower than the specificity, as can be seen from Fig.4(b). The experimental results show that the spectral SNR plays a decisive role in modeling analysis. A spectrometer with a high SNR is therefore a prerequisite for improving the accuracy of quality identification.

Fig.4 Effect of SNR on prediction accuracy of green tea origins identification model

It is worth noting that, according to the experimental results, the prediction accuracy of neither the calibration set nor the prediction set follows a strictly linear downward trend as the SNR decreases. Taking the prediction set as an example, the prediction accuracy first decreases slowly, then decreases linearly, and finally levels off. The first inflection point appears at an SNR of approximately 23 dB, where the prediction accuracy is approximately 90%.

To describe the relationship between prediction accuracy and spectral SNR quantitatively, the prediction accuracy of the green tea samples is fitted against SNR with a fifth-order polynomial, as shown in Fig.5. The fitting formula is

$A_P = a_1 S^5 + a_2 S^4 + a_3 S^3 + a_4 S^2 + a_5 S + a_6$, (2)

where $A_P$ is the prediction accuracy (expressed as a fraction, so that 0.90 corresponds to 90%); the values of $a_1$ to $a_6$ are -0.000 001 346, -0.000 117 3, -0.003 716, -0.051 55, -0.316 3 and -0.162, respectively; and $S$ is the negative of the SNR in dB ($S = -\mathrm{SNR}$). According to the fitted curve, the minimum SNRs corresponding to different prediction accuracy values are calculated, as listed in Table 1. In particular, the minimum SNRs corresponding to prediction accuracies of 90% and 85% are 23.42 dB and 21.16 dB, respectively.

Fig.5 Fitted curve for SNR to prediction accuracy of green tea samples
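Eq.(2) can be evaluated directly with the reported coefficients, taking $S$ as the negative of the SNR in dB. The sketch below also inverts the curve by bisection to find the minimum SNR for a target accuracy. The bracket of 20 dB to 25 dB is a hypothetical choice that covers the 85% and 90% targets, and the bisection assumes that the fitted accuracy rises with SNR inside this bracket.

```python
# Coefficients a1..a6 of Eq.(2), highest power first, as reported in the text.
COEFFS = [-0.000001346, -0.0001173, -0.003716, -0.05155, -0.3163, -0.162]

def prediction_accuracy(snr_db):
    """Evaluate Eq.(2) with S = -SNR using Horner's scheme."""
    s = -snr_db
    result = 0.0
    for a in COEFFS:
        result = result * s + a
    return result

def min_snr_for(target, low=20.0, high=25.0, tol=1e-6):
    """Bisection for the SNR at which the fitted accuracy reaches `target`.

    Assumes the target is bracketed by [low, high] and that accuracy
    increases with SNR on that interval.
    """
    while high - low > tol:
        mid = (low + high) / 2
        if prediction_accuracy(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

for snr_db in (21.16, 23.42):
    print(f"SNR {snr_db} dB -> fitted accuracy {prediction_accuracy(snr_db):.3f}")
print(f"minimum SNR for 90%: {min_snr_for(0.90):.2f} dB")
```

The fitted accuracies at 21.16 dB and 23.42 dB come out close to 0.85 and 0.90, consistent with the thresholds quoted above.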

Table 1 The minimum SNR corresponding to different gradient prediction accuracy values

2.2 Methods to improve SNR

Once the spectrometer has been chosen, the following measures can be taken to improve the spectral SNR to a certain extent:

1) Multiple sampling and averaging. Averaging repeated scans is the most commonly used method to improve SNR in spectral analysis and can effectively suppress white noise[9-11]. Generally, the SNR increases in proportion to the square root of the number of averaged scans[12].

2) Extending the integration time. When the effective incoming radiance is constant, the longer the integration time, the more charge accumulates, and the total amount of charge is proportional to the integration time. At the same time, the noise also increases, including shot noise, dark current noise, amplifier noise and fixed noise due to dark current instability. Since these are all random noise, they do not increase linearly with the integration time. As a result, when the integration time is extended by a factor of $m$, the spectral SNR can be increased by a factor of about $\sqrt{m}$, provided that the detector is not saturated[13-14].

3) Chopping the incoming light. Chopping the incident light with an optical chopper before it enters the detector converts the time-domain spectral data into frequency-domain information, turning the DC spectral electrical signal into an AC signal, which can effectively improve the spectral SNR. Zhan et al. improved the system SNR by more than 100 times by using an optical chopper to modulate the incident light together with a specially designed signal processing circuit[15].

4) Modulating the polarization state of the incident light. Gobrecht and Han et al. built a polarization spectra acquisition system based on a traditional fiber-optic spectrometer by combining a polarization modulation device with the spectrometer. By separately modulating the incident and reflected light, the system can effectively suppress the stray light scattered by the sample particles[16-18].

5) Chemometric methods. The commonly used chemometric methods include smoothing, derivative processing, and the like[19]. Smoothing can effectively remove random noise, whereas derivative processing enhances the spectral details of the sample but also amplifies spectral noise[20-21]. In general, the appropriate spectral pretreatment methods are selected on the basis of the modeling results.
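The square-root dependence of SNR on the number of averaged scans, noted in item 1), can be illustrated with a small simulation. The sketch assumes independent, zero-mean Gaussian noise in every scan; the number of points, the noise level and the seed are hypothetical, and the choice of 40 scans simply mirrors the acquisition setting in Section 1.2.

```python
import math
import random

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def average_scans(scans):
    """Point-by-point mean of several scans of equal length."""
    n = len(scans)
    return [sum(column) / n for column in zip(*scans)]

rng = random.Random(0)
n_points, n_scans, sigma = 2000, 40, 1.0

# Each scan holds only noise here, so its RMS is the noise level itself.
scans = [[rng.gauss(0.0, sigma) for _ in range(n_points)] for _ in range(n_scans)]

single_rms = rms(scans[0])
averaged_rms = rms(average_scans(scans))
gain = single_rms / averaged_rms   # expected to be close to sqrt(n_scans)

print(f"noise reduction: {gain:.2f}, sqrt(N): {math.sqrt(n_scans):.2f}")
```

Because averaging leaves the signal unchanged, the reduction in noise RMS translates directly into the same gain in SNR under these assumptions.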

3 Conclusion

Practical applications require spectral analysis instruments to become miniaturized and portable, which inevitably sacrifices spectral resolution, SNR and other technical indicators. In addition, online analysis is more susceptible to interference from external stray light and other factors than laboratory analysis, which further reduces the spectral SNR. To explore the influence of SNR on the accuracy of spectral analysis, the factors affecting the spectral SNR were introduced first. Taking green tea as an example, the influence of spectral SNR on the prediction accuracy of the green tea origin identification model was then analyzed experimentally. On this basis, the quantitative relationship between spectral SNR and the accuracy of spectral analysis was obtained by curve fitting. Finally, common methods for improving spectral SNR were discussed. This study provides a useful reference for the practical application of spectral analysis technology and the development of miniaturized spectrometric instruments.