Maize haploid recognition study based on nuclear magnetic resonance spectrum and manifold learning
Introduction
Haploid breeding has become one of the major maize breeding techniques. It helps overcome the long cycles, high cost and low efficiency of conventional breeding and is very effective for developing new varieties (Weber, 2014). The primary condition for implementing this technology is to obtain a sufficient quantity of maize haploid kernels. Haploids occur naturally in maize at a frequency of only 0.05–0.1%, and even with artificial induction by a high-frequency haploid inducer the induction rate is only 8–15% (Cai et al., 2008; Chalyk and Rotarenco, 2001; Chen and Song, 2003; Dang et al., 2012; Liu and Song, 2000; Prigge et al., 2011; Rober et al., 2005). Therefore, one of the key problems in realizing high-throughput commercialization of haploid breeding technology lies in developing an effective haploid recognition system (Dwivedi et al., 2015).
The haploid recognition methods most extensively applied at present are near-infrared spectroscopy (NIRS), machine vision and NMR quantitative analysis. NIRS is rapid and nondestructive, and micro-NIR spectrometers scan quickly and cost little, which makes them useful for automatically selecting haploid maize kernels from hybrid kernels (Qin et al., 2016; Li et al., 2018; Lin et al., 2018). However, the NIR spectra of maize haploid kernels are easily affected by many factors, such as light, temperature, humidity, NIR intensity and the collecting instrument (Zhou et al., 2007). The machine vision method is based on the Navajo marker (Nanda and Chase, 1966), which produces different color features in the embryos of haploid and diploid kernels. Li et al. designed a haploid screening system based on machine vision; the success rate of capturing images that contained the embryo surface reached 90%, and the correct haploid recognition rate of the system was 95% (Li et al., 2016). However, this genetic marker method still has certain limitations. First, when the female parent being induced carries dominant pigment-inhibiting genes, the color marker gene cannot be expressed; second, the expression of the marker differs considerably between hybridized material combinations (Zhang et al., 2013; Li et al., 2016). Chen and Song proposed using the oil xenia effect for haploid recognition: haploid and diploid kernels induced by a high-oil inducer differ significantly in oil content (Chen and Song, 2003). Haploids can therefore be separated out by measuring kernel oil content with an NMR spectrometer. An automatic haploid screening system based on NMR quantitative analysis has been developed, which achieves a recognition speed of 4 s per kernel with an accuracy of 94% (Wang et al., 2016).
A pattern recognition method based on the low-field NMR spectrum of the maize kernel was used in this paper. This method has three advantages. First, it is unnecessary to calculate oil content, which eliminates the weighing step and improves automatic recognition efficiency. Second, it is not necessary to rebuild a calibration curve on a regular schedule to keep the oil content measurement accurate, which reduces the operating difficulty. Finally, because it does not rely completely on differences in oil content for haploid recognition, it is applicable to maize kernels generated by conventional inducers, which account for the majority of maize varieties. Low-field NMR technology has been widely applied to quality detection of agricultural products in recent years. Santos et al. used low-field H-NMR to detect milk adulterated with synthetic emulsions at different volume ratios, applied multivariable and single-variable data processing, and established 2 classification models to control and classify milk quality (Santos et al., 2016). Ribeiro et al. used low-field H-NMR to measure the longitudinal and transverse relaxation times of honeys adulterated with 0–100% high fructose corn syrup. After double-exponential fitting of the measurements, they found that differences between honeys of different adulteration ratios in pH, color, water content, water activity and ash content were reflected in the fitting results, indicating that low-field NMR technology could be used to differentiate pure honey from honey adulterated with high fructose corn syrup (Ribeiro et al., 2014). These studies have provided a theoretical foundation for this paper.
An NMR spectrum is high-dimensional data. A pattern recognition method needs to extract as much effective information as possible to classify accurately, so effective feature extraction and dimensionality reduction are an important step in the identification process. Traditional linear dimension reduction methods assume that the data have a global linear structure; representative methods are principal component analysis (PCA) and linear discriminant analysis (LDA). In practice, however, many high-dimensional data sets are distributed on a low-dimensional nonlinear structure embedded in the high-dimensional linear space, and traditional linear dimensionality reduction methods cannot effectively preserve this nonlinearity. Nonlinear dimensionality reduction methods such as kernel methods and manifold learning have therefore been developed. The kernel method derives from the development and application of Support Vector Machine theory. It maps the original data into a higher-dimensional feature space through a nonlinear mapping and then processes the mapped data with a linear learning algorithm in that feature space. The main problems of the kernel method are its large computational cost and the dependence of the dimension reduction effect on the choice of kernel function, which usually has to be determined by experience (Huang, 2018).
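As a rough illustration of the linear baseline mentioned above, the following sketch applies PCA (computed through a singular value decomposition) to synthetic decay curves. The data, the decay rates, the noise level and the function name `pca_project` are all assumptions made for illustration; none of them come from the study. Because the synthetic curves depend on a single rate parameter, they trace a curved one-dimensional set in a 200-dimensional space, which is the kind of structure a linear projection cannot describe with a single direction.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the leading principal components via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean  # centre each feature (time point)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)  # variance fraction per component
    return Xc @ components.T, components, explained[:n_components]

# Hypothetical data: 40 synthetic mono-exponential decay curves, 200 time points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
rates = rng.uniform(2.0, 10.0, size=40)  # arbitrary illustrative decay rates
X = np.exp(-np.outer(rates, t)) + 0.01 * rng.normal(size=(40, 200))

Z, components, explained = pca_project(X, n_components=2)
print(Z.shape, explained)
```

The low-dimensional coordinates in `Z` could then be passed to any classifier. Nonlinear methods would replace only the projection step.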
Drawing on the concept of the topological manifold, manifold learning algorithms assume that high-dimensional observations are sampled from an underlying low-dimensional manifold. The assumed manifold is learned through an explicit or implicit mapping, and the original data are projected from the surrounding observation space to a low-dimensional embedded space in which some global or local geometric attributes and internal structures of the original data are preserved (Huang and Liu, 2007). Owing to their nonlinear character and structure-preserving mapping, manifold learning algorithms have been applied successfully in many areas, for instance facial expression image analysis, data visualization, image information retrieval and anomaly detection, and have become an important means of dimension reduction in many high-dimensional data analysis processes. Manifold learning was used in this study for dimension reduction, and its performance in processing the NMR spectra of maize kernels was discussed. In addition, most manifold learning algorithms in current use map data of different categories onto the same low-dimensional embedding manifold. However, data of different categories have different features, and the assumption that they lie on different manifold structures seems more reasonable (Hettiarachchi and Peters, 2015). A new multi-manifold framework was therefore proposed in this paper for the recognition of maize haploid kernels. The framework performs recognition by establishing a low-dimensional manifold for each category and using distance to characterize similarity.
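The idea of one manifold per category with a distance-based decision can be sketched as follows. This is only one possible reading and not the algorithm of the paper: here the distance from a sample to a class is assumed to be the residual left when the sample is rebuilt from its nearest neighbours in that class, using LLE-style weights that sum to one. The class names, the synthetic decay curves and the parameters `k` and `reg` are hypothetical.

```python
import numpy as np

def reconstruction_error(x, class_samples, k=5, reg=1e-3):
    """Distance from x to one class: residual of an LLE-style local reconstruction."""
    k = min(k, len(class_samples))
    # k nearest neighbours of x among the samples of this class only.
    dists = np.linalg.norm(class_samples - x, axis=1)
    neighbours = class_samples[np.argsort(dists)[:k]]
    # Weights (summing to one) that best rebuild x from those neighbours.
    diff = neighbours - x
    gram = diff @ diff.T
    gram = gram + (reg * np.trace(gram) + 1e-12) * np.eye(k)  # regularise
    w = np.linalg.solve(gram, np.ones(k))
    w = w / w.sum()
    return float(np.linalg.norm(x - w @ neighbours))

def classify(x, samples_by_class, k=5):
    """Assign x to the class whose local structure reconstructs it best."""
    errors = {label: reconstruction_error(x, s, k)
              for label, s in samples_by_class.items()}
    return min(errors, key=errors.get), errors

def make_curves(rates, t, rng, noise=0.01):
    """Hypothetical stand-in for spectra: noisy mono-exponential decays."""
    return np.exp(-np.outer(rates, t)) + noise * rng.normal(size=(len(rates), len(t)))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
samples_by_class = {
    "class_a": make_curves(rng.uniform(2.0, 3.0, 30), t, rng),   # arbitrary rates
    "class_b": make_curves(rng.uniform(8.0, 10.0, 30), t, rng),  # arbitrary rates
}
query = np.exp(-2.5 * t)
label, errors = classify(query, samples_by_class)
print(label, errors)
```

Keeping the neighbour search inside each class means every class defines its own local geometry, so a sample is compared with each class structure separately instead of with one shared embedding.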
To sum up, maize kernels generated by a high-oil induction system and by a conventional induction system were examined in this paper. Two questions were addressed: the feasibility of a pattern recognition method that combines the NMR spectrum with a manifold learning dimension reduction algorithm for maize haploid recognition, and the effect of the proposed multi-manifold learning framework on recognition.
Section snippets
Experimental samples
Experimental samples were divided into two parts, both of which were provided by the National Maize Improvement Center of China Agricultural University. The experimental materials were generated using an inducer carrying the R1-nj gene marker as the male parent to induce common hybrids; diploid kernels show the purple marker at the embryo, while haploid kernels are colorless at the embryo because of parthenogenesis. In part one, high oil inducer CHIO3 (oil content: 8.72%) was used as the male
NMR spectrum analysis
NMR spectra of Zhengdan 958H generated by the high-oil inducer and Zhengdan 958C generated by the conventional inducer, as acquired in the experiment, are shown in Fig. 4a and c respectively. The x-coordinate represents relaxation time and the y-coordinate signal intensity; the spectral signal appears as an attenuation curve. According to Fig. 4a and b, NMR spectra of haploids and diploids generated by high-oil inducer are obviously different in the overall distribution, and this difference
Conclusion
This study discussed the feasibility of a pattern recognition method that combines the NMR spectrum with a manifold learning dimension reduction algorithm for maize haploid recognition. Firstly, experimental results verified that the pattern recognition method based on NMR spectrum could be used for haploid recognition, and the recognition rate of the high-oil-induced kernels could reach as high as 98%; for maize kernels generated by conventional inducer, as the oil content
Acknowledgements
The authors gratefully acknowledge the financial support from the National Key R&D Program of China (Grant No. 2017YFD0701702).
References (27)
- et al. Haploids: constraints and opportunities in plant breeding. Biotechnol. Adv. (2015)
- et al. Peach variety identification using near-infrared diffuse reflectance spectroscopy. Comput. Electron. Agric. (2016)
- et al. Multi-manifold LLE learning in pattern recognition. Pattern Recogn. (2015)
- et al. Detection of honey adulteration of high fructose corn syrup by Low Field Nuclear Magnetic Resonance (LF H-1 NMR). J. Food Eng. (2014)
- et al. Detection and quantification of milk adulteration using time domain nuclear magnetic resonance (TD-NMR). Microchem. J. (2016)
- et al. Fully-automated high-throughput NMR system for screening of haploid kernels of maize (corn) by measurement of oil content. PLoS One (2016)
- Today's use of haploids in corn plant breeding
- et al. The advances in haploid breeding of maize. Journal of Maize Sciences (2008)
- et al. The use of matroclinous maize haploids for recurrent selection. Russ. J. Genet. (2001)
- et al. Identification haploid with high oil xenia effect in maize. Acta Agron. Sinica (2003)
- Inducer line generated double haploid seeds for combined waxy and opaque 2 grain quality in subtropical maize (Zea mays. L.). Euphytica
- Overview of nonlinear dimensionality reduction methods in manifold learning. Appl. Res. Comput. (China)
1 These authors contributed equally to this work.