Identification of Haploid Maize Kernel Using NIR Spectroscopy in Reflectance and Transmittance Modes: A Comparative Study

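2016-06-15 16:36 QIN Hong, MA Jingyi, CHEN Shaojiang, YAN Yanlu, LI Weijun, WANG Ping, LIU Jin
Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2016, Issue 1
Keywords: haploid; orientation; recognition rate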

QIN Hong1, MA Jing-yi1,2, CHEN Shao-jiang3, YAN Yan-lu4, LI Wei-jun1*, WANG Ping2, LIU Jin3

1. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
2. College of Information and Control Engineering, China University of Petroleum (Huadong), Qingdao 266580, China
3. National Maize Improvement Center, China Agricultural University, Beijing 100193, China
4. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China

Using a MicroNIR-1700 series miniature near-infrared spectrometer (JDSU), we explored which spectral measurement mode is suitable for identifying haploid maize kernels. Based on near-infrared spectroscopy (NIRS) qualitative analysis, we compared reflectance and transmittance spectra for this task. Partial least squares combined with orthogonal linear discriminant analysis (PLS+OLDA) was used to compress the pretreated spectral data, and identification models were then built with a support vector machine (SVM). Spectra were recorded in both reflectance and transmittance modes and the correct recognition rates were calculated. In reflectance mode, the average recognition rate was below 60% when embryo orientation was not distinguished. In transmittance mode, however, the average recognition rate reached 93.2%. The results indicate that diffuse reflectance spectra capture only information from the surface layer of the kernel, so embryo orientation severely affects haploid identification in reflectance mode but has far less impact in transmittance mode. When non-uniform samples are analyzed, near-infrared diffuse transmittance spectra accumulate information along the full depth of the optical path, so information from the sample interior is obtained; transmittance spectra can therefore identify haploid maize effectively and are insensitive to kernel position. NIRS qualitative analysis is rapid and nondestructive, and the MicroNIR spectrometer scans quickly at low cost, which makes the approach useful for automatically selecting haploid maize kernels from hybrid kernels.

Keywords: Near infrared spectroscopy; Haploid maize identification; Reflectance spectra; Transmittance spectra; Qualitative analysis

Biography: QIN Hong (1977-), female, engineer, Institute of Semiconductors, Chinese Academy of Sciences. E-mail: qinh@semi.ac.cn. *Corresponding author e-mail: wjli@semi.ac.cn

Introduction

Haploid technology, in which pure lines are obtained and then bred into inbred lines, can accelerate breeding and improve its efficiency. In recent years, haploid breeding of maize based on biological induction has therefore gradually become one of the key technologies in maize breeding[1]. Because haploids occur naturally at a very low rate (0.05%~0.1%) and at less than 10% even with artificial induction[2], rapidly and accurately identifying haploid kernels among the large number of kernels produced by induction is important for maize haploid breeding.

At present, the conventional method for identifying haploids in breeding is the genetic marker method[3], which relies on the color markers of the kernel and on manual identification and separation of haploid kernels. Manual selection by naked-eye observation easily causes visual and mental fatigue, which lowers efficiency and increases misidentification; it is also subjective, laborious and time-consuming. In addition, many materials show only weak color markers, which reduces identification efficiency and gives unsatisfactory results. A rapid identification technique that is easier to automate is therefore needed. Zhang Junxiong et al.[4] studied a feature extraction and dynamic recognition method for the embryos of maize haploid seeds. The correct recognition rate was 98.04% for haploids and 94.44% for chimeras. The method is based on machine vision, is suitable for varieties with clear color markers, and needs to see the embryo surface, so the seeds must be placed according to embryo orientation; it is therefore not easy to automate fully. Liu Jin et al.[5] combined the pollen xenia effect with nuclear magnetic resonance (NMR) measurement of oil content to separate maize haploid kernels, with an average recognition rate of 92.3% and a recognition speed of 4 s per kernel. This method detects and separates kernels well, but it only works for kernels from high-oil inducers and is difficult to popularize because NMR instruments are expensive.

NIRS qualitative analysis is rapid, nondestructive, low-cost and easy to operate[6]. It has been applied extensively in research on crop seed identification, with very good results, but NIR qualitative analysis for identifying haploid maize seeds has not yet been reported. Conventional near-infrared spectral analysis mainly uses diffuse reflectance spectra, which require uniform samples that meet certain mass or volume requirements. Maize seeds differ in size and shape, and the composition within a kernel is unevenly distributed. Haploid and polyploid kernels of the same variety differ little in shape; the essential differences between them lie in the kernel interior, especially the embryo. When a single seed is analyzed by diffuse reflectance spectrometry, its size, shape, surface morphology and placement strongly affect the result, which is called the position effect. Because of the position effect, conventional near-infrared analysis is not applicable to single kernels, and this is one of the main reasons why single-kernel near-infrared analysis is not yet practical. By contrast, when non-uniform samples are analyzed, near-infrared diffuse transmittance spectra accumulate information along the full depth of the optical path, so information from the sample interior is obtained and the influence of the position effect is greatly reduced. In this paper, qualitative near-infrared spectroscopy is applied to the identification of maize haploid kernels, and the identification results of diffuse reflectance and diffuse transmittance spectra are compared. The experimental results show that, when embryo orientation is not distinguished, spectra measured by diffuse reflectance cannot identify maize haploids effectively. Spectra measured by diffuse transmittance carry more information about the kernel interior and so allow effective identification of maize haploid and polyploid kernels. Diffuse transmittance identification with a micro spectrometer places no special requirements on the samples, is simple, fast and low-cost, and lends itself to a practical automatic identification and sorting system for maize haploid seeds.

1 Experiments

1.1 Instruments and equipment

The instrument was a MicroNIR-1700 series miniature near-infrared spectrometer (JDSU); its schematic diagram is shown in Fig.1. The instrument parameters are as follows. Light source: two integrated vacuum tungsten lamps; dispersing element: linear variable filter (LVF); detector: 128-element uncooled indium gallium arsenide (InGaAs) linear diode array; wavelength range: 950~1650 nm; resolution: 12.5 nm; typical measuring time: 0.25 s. The data analysis software was MATLAB R2010b (MathWorks, USA).

Fig.1 Schematic diagram of the MicroNIR reflectance measurements

The experiments were divided into two groups, diffuse reflectance and diffuse transmittance. The diffuse reflectance experiment used the built-in light source of the micro spectrometer (the two integrated vacuum tungsten lamps); light illuminated the maize kernel from below, and the detector captured the light diffusely reflected by the kernel. In the diffuse transmittance experiment the built-in light source was switched off and a halogen tungsten lamp served as an external light source; light illuminated the kernel obliquely from above, and the detector captured the light diffusely transmitted through the kernel.

1.2 Sample source and spectra acquisition

The research objects were haploid and polyploid maize kernels provided by the National Maize Improvement Center, obtained by hybridization induction with the Navajo genetic marker introduced.

In the diffuse reflectance experiments, data were collected on five days (October 16, 17, 18, 21 and 22, 2013). Each day, 100 haploid and 100 polyploid spectra were collected; in each group of 100, 35 kernels were placed embryo down, 35 embryo up, and 30 at random. The five daily data sets were numbered R1~R5 in order of collection. The spectral curves are shown in Fig.2(a).

In the diffuse transmittance experiments, data were collected on three days (May 26, 27 and 28, 2014), with one set collected in the morning and another in the afternoon of each day, giving 6 sets in total. Each set contained 50 haploid and 50 polyploid spectra, and all kernels were placed randomly. The six sets were numbered T1~T6 in order of collection. The spectral curves are shown in Fig.2(b). The spectra show that the absorbance of the diffuse reflectance spectra ranges over 0.15~0.45, a spread of about 0.3, while the absorbance of the diffuse transmittance spectra ranges over 0~0.15, a spread of about 0.15. Maize seeds of the same kind have similar structure and composition. The near-infrared diffuse transmittance spectrum of a single kernel reflects its overall structure and composition, so the spectra of seeds of the same kind are close to one another and have a small spread; this within-kind consistency is the basis for distinguishing different kinds of maize seed. Diffuse reflectance spectra behave differently. If the endosperm faces the light, the starch of the endosperm (its characteristic component) absorbs light more strongly, and the diffuse reflectance spectrum shows a relatively stronger O-H characteristic peak. If the embryo faces the light, the protein of the embryo (its characteristic component) absorbs more strongly, and the spectrum shows a relatively stronger N-H characteristic peak. Measured spectra of these two types are usually mixed together, so the spread of a diffuse reflectance spectral set is greater than that of a diffuse transmittance set, and the accuracy of seed identification suffers. Compared with diffuse reflectance spectra, diffuse transmittance spectra have a smaller spread in absorbance, and the accuracy of spectral analysis is higher[7].
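The spread comparison above is simple arithmetic on the absorbance ranges quoted from Fig.2. A minimal sketch follows; the function name `spread` is introduced here for illustration only, and "spread" is taken to mean the width of the range (maximum minus minimum), which is an assumption about how the quoted values were obtained.

```python
# Absorbance ranges quoted in the text for Fig.2(a) and Fig.2(b).
reflectance_range = (0.15, 0.45)
transmittance_range = (0.0, 0.15)


def spread(value_range):
    """Return the width (max - min) of an absorbance range."""
    low, high = value_range
    return high - low


reflectance_spread = spread(reflectance_range)      # about 0.3
transmittance_spread = spread(transmittance_range)  # about 0.15
# How many times wider the reflectance set is than the transmittance set.
spread_ratio = reflectance_spread / transmittance_spread
```

Under this reading, the reflectance spectra are spread over roughly twice the absorbance range of the transmittance spectra.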

Fig.2 Schematic diagram of the spectral curve

1.3 Spectral preprocessing, feature extraction and modeling

The original spectral data were preprocessed[8] with a combination of smoothing, first derivative (FD) and vector normalization (VN). Preprocessing is not the focus of this paper and is not described in detail here.
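Since the paper gives no parameters for these three steps, the following is only a rough sketch of what such a chain can look like. The moving-average window, the adjacent-point difference used for the derivative, and the definition of vector normalization as scaling to unit Euclidean length are all assumptions, and the sample values are hypothetical rather than measured data.

```python
import math


def moving_average(spectrum, window=5):
    """Smooth with a centered moving average; edges use a shorter window."""
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        start = max(0, i - half)
        stop = min(len(spectrum), i + half + 1)
        segment = spectrum[start:stop]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def first_derivative(spectrum):
    """Approximate the first derivative by adjacent-point differences."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def vector_normalize(spectrum):
    """Scale the spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        return list(spectrum)
    return [v / norm for v in spectrum]


def preprocess(spectrum, window=5):
    """Apply smoothing, then first derivative, then vector normalization."""
    return vector_normalize(first_derivative(moving_average(spectrum, window)))


# Hypothetical 8-point absorbance readings (illustrative, not measured data).
example = [0.10, 0.12, 0.15, 0.19, 0.24, 0.28, 0.30, 0.31]
processed = preprocess(example)
```

The derivative step shortens the spectrum by one point, and after normalization every processed spectrum has the same length in the Euclidean sense, which removes overall intensity differences between kernels.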

After preprocessing, features were extracted with PLS+OLDA, following the method of Ref.[8]. Partial least squares (PLS) regression[9] combines data decomposition and regression in one step, so the extracted latent vectors are directly related to the class property being predicted, and the extracted components reflect the category information as fully as possible. Linear discriminant analysis (LDA) is a classical and effective dimension-reduction method. It finds a projection matrix composed of discriminant vectors that projects the raw data into a low-dimensional space in which samples of the same class are as concentrated as possible and samples of different classes are as dispersed as possible, i.e. it maximizes the ratio of between-class to within-class scatter[10]. Orthogonal linear discriminant analysis (OLDA)[11] improves on LDA by requiring the discriminant vectors to be mutually orthogonal.
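To make the scatter-ratio idea concrete, here is a minimal sketch of the plain two-class Fisher discriminant in two dimensions. It is not the laboratory's OLDA code and it omits the orthogonality constraint; the direction is computed as w = Sw^-1 (mean_a - mean_b), where Sw is the within-class scatter matrix. The data points and all function names are hypothetical.

```python
def mean_vector(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter_matrix(points, mean):
    """2x2 scatter matrix: sum over points of (x - mean)(x - mean)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Return w = Sw^-1 (mean_a - mean_b) for two classes of 2-D points."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, ma), scatter_matrix(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Closed-form inverse of a 2x2 matrix applied to diff.
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]


def project(points, w):
    """Project each 2-D point onto the direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical 2-D feature points for two classes (illustrative only).
class_a = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]]
class_b = [[6.0, 5.0], [7.0, 8.0], [8.0, 7.0], [7.0, 6.0]]
w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)
```

For these points the two classes no longer overlap after projection onto w, which is the "same class concentrated, different classes dispersed" behavior described above.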

In this paper, a support vector machine (SVM) is used to build the maize haploid identification model. SVM is a machine learning method that maps the sample space, through a nonlinear mapping, into a feature space of high or even infinite dimension, so that a problem that is not linearly separable in the original sample space becomes linearly separable in the feature space[12]. SVM is often used for binary classification, so we chose it as the classifier for distinguishing maize haploid from polyploid kernels.

The reflectance and transmittance data were processed with the same algorithm. First, the PLS algorithm reduced the dimensionality of the pretreated data. Second, the first 9 dimensions obtained were reduced to two dimensions using OLDA code written in our laboratory. Finally, the identification model was established with the SVM algorithm (polynomial kernel).
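The role of a polynomial kernel in the SVM step can be illustrated on the two-dimensional features that come out of the reduction. The sketch below checks that a degree-2 polynomial kernel equals an ordinary dot product in an explicit higher-dimensional feature space. The degree and the zero offset are assumptions (the paper states neither), and the function names are hypothetical.

```python
import math


def dot(x, y):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=0.0):
    """K(x, y) = (x . y + coef0) ** degree."""
    return (dot(x, y) + coef0) ** degree


def explicit_map_degree2(x):
    """Explicit feature map for 2-D input such that
    dot(phi(x), phi(y)) == (x . y) ** 2."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


# Two hypothetical 2-D feature vectors.
x = [1.0, 2.0]
y = [3.0, -1.0]
kernel_value = polynomial_kernel(x, y)
mapped_value = dot(explicit_map_degree2(x), explicit_map_degree2(y))
```

Both values agree, so the classifier can work with a linear boundary in the mapped space while only ever evaluating the kernel on the original two-dimensional points.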

1.4 Diffuse reflectance experiment

The model was built with data set R1 and tested on R2~R5. The correct recognition rates for haploid and polyploid kernels were counted separately and then averaged. The test results are shown in Table 1.
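The scoring just described, a correct recognition rate per class followed by an average over the two classes, can be sketched as follows. The function names and the label lists are hypothetical and serve only to show the calculation; they are not results from the paper.

```python
def class_recognition_rate(true_labels, predicted_labels, target):
    """Fraction of samples whose true class is `target` that were
    also predicted as `target`."""
    pairs = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == target]
    if not pairs:
        raise ValueError(f"no samples of class {target!r}")
    correct = sum(1 for t, p in pairs if p == t)
    return correct / len(pairs)


def average_recognition_rate(true_labels, predicted_labels, classes):
    """Unweighted mean of the per-class recognition rates."""
    rates = [class_recognition_rate(true_labels, predicted_labels, c)
             for c in classes]
    return sum(rates) / len(rates)


# Hypothetical labels for eight kernels (illustrative only).
true_labels = ["haploid"] * 4 + ["polyploid"] * 4
predicted_labels = ["haploid", "haploid", "haploid", "polyploid",
                    "polyploid", "polyploid", "haploid", "haploid"]

haploid_rate = class_recognition_rate(true_labels, predicted_labels, "haploid")
polyploid_rate = class_recognition_rate(true_labels, predicted_labels, "polyploid")
average_rate = average_recognition_rate(
    true_labels, predicted_labels, ["haploid", "polyploid"])
```

Averaging the per-class rates, rather than pooling all samples, keeps one class from dominating the score if the test set is unbalanced.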

Table 1 Result of test sets in diffuse reflectance conditions

Table 1 shows that the average recognition rate for maize haploid and polyploid kernels lies between 44% and 55%, below 60%. The diffuse reflectance illumination scheme therefore cannot effectively distinguish maize haploid from polyploid kernels.

To explore further how embryo orientation affects the recognition results, two sets of experiments were designed. In the first, the model was built with the 35 embryo-down spectra in data set R4 and tested on the 35 embryo-down spectra and the 35 embryo-up spectra in data set R5, respectively. In the second, the model was built with the 35 embryo-up spectra in R4 and tested on the 35 embryo-up spectra and the 35 embryo-down spectra in R5, respectively. The correct recognition rates for haploid and polyploid kernels were counted separately and then averaged. The test results are shown in Table 2.

Table 2 Result of test sets in diffuse reflectance conditions with embryo surface orientation

Table 2 shows that when the kernels are placed embryo down, the diffuse reflectance spectra contain the most embryo information and the recognition rate is 100%; when the kernels are placed embryo up, the spectra contain a smaller proportion of embryo information and the recognition rate is significantly lower. In the worst case, in which the modeling set and the test set have opposite embryo orientations, diffuse reflectance spectra cannot effectively distinguish haploid from polyploid kernels. Because the essential difference between haploid and polyploid kernels of the same variety lies in the kernel interior, especially the embryo, and because diffuse reflectance spectra are subject to the position effect, embryo orientation is the main reason why diffuse reflectance spectra cannot accurately identify maize haploid kernels. To sort maize haploid kernels rapidly and automatically without manual intervention, the low recognition rate caused by embryo orientation has to be overcome.

1.5 Diffuse transmittance experiment

The built-in light source of the micro spectrometer was switched off, an external light source illuminated the maize kernel, and the near-infrared diffuse transmittance spectra were collected. To prevent strong direct light from damaging the detector, the incident light was set at an angle of about 45° to the kernel. The detector collects near-infrared light diffusely transmitted through the maize kernel; these spectra carry a large amount of information about the sample interior and greatly reduce the influence of the position effect. In this experiment the kernels were placed randomly, and embryo orientation was not distinguished.

T1 was used as the modeling set and T2~T6 as test sets. The correct recognition rates for haploid and polyploid kernels were counted separately and then averaged. The test results are shown in Table 3.

Table 3 Result of test sets in diffuse transmittance conditions

Table 3 shows that the lowest average recognition rate is 88%, the highest is 98%, and the mean is 93.2%; that is, the diffuse transmittance method can effectively distinguish maize haploid from polyploid kernels. In addition, the modeling data and the test data were not all collected on the same day. The modeling set T1 was collected on May 26 and test set T6 on May 28, yet the recognition rate for T6 still reached 92%. This indicates that models built on diffuse transmittance spectra remain stable over this time span, which is needed for practical application.

Diffuse transmittance spectra collected without distinguishing embryo orientation can effectively identify maize haploid seeds, and the model is stable, which provides a technical basis for automatic spectral collection and identification. The miniature near-infrared spectrometer takes only 0.25 s to collect a single spectrum. These advantages make it possible to develop high-throughput automatic sorting equipment for maize haploid kernels.
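As a rough back-of-the-envelope check on the throughput argument, the sketch below converts the 0.25 s acquisition time into an upper bound on kernels per hour and compares it with the 4 s per kernel reported for the NMR method in the Introduction. It assumes one spectrum per kernel and counts acquisition time only, ignoring kernel handling and classification, so real sorting rates would be lower. The names are hypothetical.

```python
SECONDS_PER_HOUR = 3600.0

# Values taken from the text.
NIR_SECONDS_PER_SPECTRUM = 0.25  # MicroNIR single-spectrum collection time
NMR_SECONDS_PER_KERNEL = 4.0     # NMR-based recognition speed


def kernels_per_hour(seconds_per_kernel):
    """Upper bound on throughput if measurement time is the only cost."""
    return SECONDS_PER_HOUR / seconds_per_kernel


nir_upper_bound = kernels_per_hour(NIR_SECONDS_PER_SPECTRUM)  # 14400.0
nmr_rate = kernels_per_hour(NMR_SECONDS_PER_KERNEL)           # 900.0
speed_ratio = nir_upper_bound / nmr_rate                      # 16.0
```

Under these assumptions the spectral acquisition itself would not be the bottleneck of a sorting line; mechanical feeding of kernels is more likely to set the practical rate.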

2 Results and discussion

Based on NIRS qualitative analysis, this paper compared the identification of maize haploids using diffuse reflectance and diffuse transmittance spectra. The experimental results show that, when embryo orientation is not distinguished, diffuse reflectance spectra cannot identify maize haploids effectively, whereas diffuse transmittance spectra can effectively distinguish haploid from polyploid kernels, with an average correct recognition rate of 93.2% and good stability of the model over time. The analysis suggests that diffuse reflectance spectra mainly contain information about the surface and shallow layers of the sample and are therefore more strongly influenced by kernel size, surface morphology, embryo orientation and similar factors. This reduces the proportion of information that reflects the difference between haploid and polyploid kernels and makes haploid identification harder. In the diffuse transmittance experiments an external light source illuminates the kernel and the detector collects near-infrared light diffusely transmitted through it; these spectra carry more information about the interior differences between haploid and polyploid kernels. Haploid and polyploid kernels can therefore still be identified effectively when orientation is not distinguished.

3 Conclusions

Using a MicroNIR-1700 series miniature near-infrared spectrometer (JDSU) and NIRS qualitative analysis, this paper studied the identification of maize haploid and polyploid kernels. The study indicates that the differences between haploid and polyploid kernels lie mainly in the embryo, while diffuse reflectance spectra carry information about the surface and shallow layers of the kernel. When orientation is not distinguished, near-infrared diffuse reflectance analysis therefore cannot identify haploids effectively. Diffuse transmittance spectra carry much more information about the kernel interior and largely overcome the sensitivity of diffuse reflectance spectra to embryo orientation. With the diffuse transmittance method proposed here, in which an external light source illuminates the kernel, the average correct recognition rate for haploid and polyploid kernels reached 93.2%. The miniature near-infrared spectrometer is low-cost, collects spectra quickly and is simple to operate. Near-infrared diffuse transmittance qualitative analysis combined with a micro near-infrared device lends itself to high-throughput automatic identification equipment for maize haploid kernels and is of considerable practical value.

References

[1] Shi Xiaodong, Gao Runmei. Plant Tissue Cultivation. Beijing: China Agricultural Science and Technology Press, 2009.

[2] Cai Zhuo, Xu Guoliang. Journal of Maize Sciences, 2008, 16(1): 1.

[3] Zhao Yanming, Dong Shuting, Zhang Suoliang, et al. Journal of Maize Sciences, 2007, 15(5): 60.

[4] Zhang Junxiong, Wu Zhanyuan, Song Peng, et al. Transactions of the Chinese Society of Agricultural Engineering, 2013, 29(4): 199.

[5] Liu Jin, Guo Tingting. Transactions of the Chinese Society of Agricultural Engineering, 2012, 28(z2): 233.

[6] Lu Wanzhen, Yuan Hongfu, Xu Guangtong, et al. Modern Near Infrared Spectroscopy Analytical Technology (Second Edition). Beijing: China Petrochemical Press, 2007.

[7] Yan Yanlu. Modern Instrumental Analysis (Third Edition). Beijing: China Agricultural University Press, 2010.

[8] Zhang Liping, Li Weijun, Wang Ping, et al. Spectroscopy and Spectral Analysis, 2012, 32(10): 2785.

[9] Svante Wold, Michael Sjöström, et al. Chemometrics and Intelligent Laboratory Systems, 2001, 58: 109.

[10] Duda R O, Hart P E, Stork D G. Pattern Classification. Translated by Li Hongdong, Yao Tianxiang, et al. Beijing: China Machine Press, 2003.

[11] Fan Bin, Lei Zhen, et al. Proceedings of the 8th IEEE International Conference on Automatic Face & Gesture Recognition, 2008: 1.

[12] Zhang Shanwen, Jia Qingjie, Jing Rongzhi. Journal of Anhui Agricultural Sciences, 2012, 40(1): 9.

CLC number: O657.3; S123

Document code: A

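Comparative Study on Identification of Maize Haploids Based on Near-Infrared Diffuse Reflectance and Diffuse Transmittance Spectra

QIN Hong1, MA Jingyi1,2, CHEN Shaojiang3, YAN Yanlu4, LI Weijun1*, WANG Ping2, LIU Jin3

1. Laboratory of High-Speed Circuits and Neural Networks, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083
2. College of Information and Control Engineering, China University of Petroleum (Huadong), Qingdao 266580, Shandong
3. National Maize Improvement Center, China Agricultural University, Beijing 100193
4. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083

Using a MicroNIR1700 miniature near-infrared spectrometer from JDSU, we studied which spectral measurement method is suitable for identifying haploids among single maize kernels. Based on near-infrared qualitative analysis, the identification of maize haploids under diffuse reflectance and diffuse transmittance was compared. After preprocessing, features were extracted from the spectral data with the PLS+OLDA algorithm, an SVM was used to build the maize haploid identification model, and the correct recognition rates of the model were counted separately for the diffuse reflectance and diffuse transmittance experiments. With diffuse reflectance measurement using the built-in light source of the micro spectrometer, and without distinguishing embryo orientation, the average recognition rate for maize haploid kernels was below 60%, so haploid and polyploid kernels could not be distinguished effectively. With diffuse transmittance measurement using an external light source, an average correct recognition rate of 93.2% was obtained, and the model was stable. The results show that diffuse reflectance spectra capture only information from the surface layer of the kernel, so embryo orientation severely affects haploid identification by diffuse reflectance; diffuse transmittance spectra accumulate information along the full depth of the optical path and capture information from the sample interior, so they are insensitive to embryo orientation and can effectively identify randomly placed haploid and polyploid kernels. The near-infrared method identifies haploids rapidly and nondestructively, and the micro spectrometer collects spectra quickly at low cost, which provides the conditions for practical automatic identification.

Keywords: Near-infrared spectroscopy; Haploid identification; Diffuse transmittance; Diffuse reflectance; Qualitative analysis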

Received: 2014-09-23; accepted: 2014-12-10

Funding: National Key Scientific Instrument and Equipment Development Project (2014YQ470377), the China Scholarship Council (201404910237)

DOI: 10.3964/j.issn.1000-0593(2016)01-0292-06
