Gas chromatography-ion mobility spectrometric discrimination of trunk borer infested Platycladus orientalis using a novel topographic segmentation strategy

https://doi.org/10.1016/j.compag.2022.107125

Highlights

  • A novel topographic segmentation-based feature extraction strategy for GC-IMS data.

  • Infestation stages of trunk borer (S. bifasciatus) damaged P. orientalis plants.

  • The potential of GC-IMS for detection of trunk borer in P. orientalis plants.

Abstract

Gas chromatography coupled with ion mobility spectrometry (GC-IMS), a volatile analysis technique, has been widely used in the agricultural field over the past decade. However, its complex three-dimensional (3D) fingerprints, each comprising millions of data elements, pose significant challenges for data processing. This study proposes a novel topographic segmentation-based strategy that automatically extracts the maximum and volume features of individual peaks. Its efficiency was evaluated against the manual marker selection approach using different classification and prediction models built on samples of trunk borer infested Platycladus orientalis. The grid search-support vector machine (GS-SVM) classifier combined with the topographic segmentation strategy correctly classified at least 93.67% of P. orientalis samples into their corresponding infestation stages. The volume feature performed best, with a cross-validation classification accuracy reaching 97.05%. The partial least squares regression (PLSR) prediction model based on the volume feature gave the most satisfactory result, with Rc² = 0.9624 and RMSEC = 6.328 in the calibration set and Rp² = 0.9056 and RMSEP = 9.926 in the validation set. In summary, the proposed topographic segmentation strategy was able to extract the local maxima and volume features of peaks from GC-IMS fingerprints. The GC-IMS-based approach combined with chemometric methods has the potential to discriminate the infestation stages of P. orientalis plants damaged by trunk borers.

Introduction

Ion mobility spectrometry (IMS), an analytical technique with high sensitivity for volatile organic compounds (VOCs), was first developed in the 1970s for security purposes (detection of explosives and drugs) (Borsdorf and Eiceman, 2006). However, its use in the analysis of multi-component samples developed slowly because of its poor separation capability. Gas chromatography coupled with ion mobility spectrometry (GC-IMS) has developed rapidly over the last decade as a way to compensate for this limitation. Recent applications of GC-IMS have focused on the classification and identification of food and agricultural products, such as the detection of labeling fraud in Iberian cured ham (Liu et al., 2020) and the authentication of adulterated Chinese herbal medicine (Yuan et al., 2019).

At present, gas chromatography–mass spectrometry (GC–MS) and the electronic nose (E-nose) are the two typical strategies for determining volatile compound profiles in agriculture (Cagliero et al., 2021). Although GC–MS is considered the gold standard for volatile identification in analytical laboratories, its measurements are time-consuming, and gas sampling requires complex sample pre-treatment. In contrast, gas samples can be used directly for GC-IMS measurement. IMS detection is completed within a few milliseconds, which favors its application in rapid, on-site screening. Most commercial E-noses consist of an array of gas sensors that respond to many VOCs with low selectivity, so they provide only mixed response curves rather than signals of specific compounds (Cui et al., 2018). In GC-IMS fingerprints, by contrast, each chromatographic peak represents a different analyte. Visualized in three-dimensional space, characterized by retention time, drift time, and intensity value, the data are more intuitive to analyze.

In GC-IMS data, each compound is recorded in a three-dimensional (3D) fingerprint characterized by retention time in the GC column, drift time in the IMS drift tube, and the corresponding intensity value. Adding the spectrometric dimension increases the data dimensionality significantly compared with classical two-dimensional (2D) analyses (time versus signal intensity), so several conventional 2D data processing methods are not applicable to 3D GC-IMS fingerprints. In addition, every measurement generates millions of data elements because of the short measurement interval with millisecond precision. These data characteristics pose significant challenges for feature extraction from GC-IMS fingerprints.

Until now, GC-IMS data processing has followed two main strategies: manually picking markers by inspecting the topographic map (Chen et al., 2021, Leng et al., 2021), and using the complete fingerprint data (Gerhardt et al., 2019, Schwolow et al., 2019). Each strategy has its strengths and weaknesses, and the two have been applied and compared in several studies (Contreras et al., 2019, Zheng et al., 2021a). Picking markers is subjective and time-consuming, but almost all compounds can be found, even those with weak signals, so this strategy is the most widely used in existing studies. By contrast, processing the complete fingerprint is highly efficient if combined with a proper approach. However, since a complete fingerprint contains large amounts of redundant information, a conventional dimensionality reduction method such as principal component analysis (PCA) is unsuitable for two-dimensional GC-IMS data. Hence, an appropriate pre-treatment procedure and feature extraction method are essential.

From an image processing perspective, GC-IMS fingerprints resemble grey-scale images: retention time and drift time correspond to the image coordinates (x and y), and signal intensity corresponds to the grey value. On this basis, the chromatographic peak of each compound can be separated using image segmentation techniques. Because of tailing peaks and noise, such segmentation separates weak signals better than threshold segmentation does. Afterwards, data features of each peak, including its maximum value and volume, can be extracted automatically from the separated regions, replacing the manual marker selection process.
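
The image analogy above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Conclusions name a watershed segmentation algorithm and an active contour model, whereas this sketch swaps in a simpler steepest-ascent labeling, in which every pixel above a noise floor climbs to the local maximum it leads to. The function names `segment_peaks` and `peak_features`, the `noise_floor` parameter, and the toy `fingerprint` matrix are all hypothetical, and volume is taken here as the plain sum of intensities on a unit grid.

```python
def segment_peaks(image, noise_floor=0.0):
    """Label every pixel above noise_floor with the summit it climbs to.

    image is a list of rows (retention time) of columns (drift time).
    Returns a grid of (row, col) summit coordinates, or None for background.
    """
    rows, cols = len(image), len(image[0])

    def uphill(r, c):
        # Return the highest 8-neighbour if it is strictly higher, else (r, c).
        best = (r, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and image[nr][nc] > image[best[0]][best[1]]):
                    best = (nr, nc)
        return best

    labels = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= noise_floor:
                continue  # background pixel, not part of any peak
            pos = (r, c)
            while True:
                nxt = uphill(*pos)
                if nxt == pos:  # reached a local maximum
                    break
                pos = nxt
            labels[r][c] = pos
    return labels


def peak_features(image, labels):
    """Collect the maximum value and volume (intensity sum) of each region."""
    features = {}
    for r, row in enumerate(labels):
        for c, summit in enumerate(row):
            if summit is None:
                continue
            entry = features.setdefault(
                summit,
                {"maximum": image[summit[0]][summit[1]], "volume": 0.0},
            )
            entry["volume"] += image[r][c]
    return features


# Toy fingerprint (hypothetical values): a strong peak and a weaker neighbour.
fingerprint = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0, 1, 0],
    [0, 2, 9, 2, 1, 5, 1],
    [0, 1, 2, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

labels = segment_peaks(fingerprint, noise_floor=0)
features = peak_features(fingerprint, labels)
for summit, values in sorted(features.items()):
    print(summit, values)
```

On real fingerprints, plateaus and noise would create spurious summits, so smoothing or a marker-based watershed would likely be needed before a labeling of this kind becomes useful.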

The segmentation-based strategy was applied to GC-IMS fingerprints obtained from trunk borer infested Platycladus orientalis at different infestation stages. Semanotus bifasciatus is the most typical trunk borer of P. orientalis. Early detection of the pest allows P. orientalis to be treated in time and stops further spread during the infestation. Among current solutions for detecting damage by wood-boring beetles in forests, the drilling resistance method correlates drilling pressure with wood density (Nowak et al., 2016), but it provides only a limited amount of information along a straight line. Stress waves combined with acoustic tomography can be used to quantify extensive decay of the trunk indirectly (Liu and Li, 2018), but the measuring precision is limited by the sensor arrangement and multichannel signal processing. Although the feeding behavior of the pests is hard to monitor from external appearance alone, pest feeding tends to trigger a volatile response in plants (Gish et al., 2015). Our previous study found that the twig VOCs of P. orientalis plants damaged by Phloeosinus aubei adults showed increases of between 1.39-fold and 5.65-fold compared with undamaged plants (Zheng et al., 2021a).

Consequently, the main objective of this study is to propose a topographic segmentation-based feature extraction strategy for GC-IMS fingerprints. With this strategy, individual peak areas can be separated from an original fingerprint, replacing the time-consuming manual picking process. The proposed strategy was tested on GC-IMS measurements of P. orientalis samples infested by trunk borers for different durations. Finally, different chemometric approaches were employed to evaluate its performance against existing strategies for GC-IMS data processing.

Section snippets

Samples preparation and treatment

In this study, 20 living P. orientalis trees, planted in the 1950s in the Lingyan artificial forests (36°16′ N, 117°6′ E, Shandong province, China), free of infestation and with diameters of 8–12 cm, were selected as sample plants. Three pairs of newly emerged S. bifasciatus adults (3 males and 3 females) were fixed to the plant trunk about 1 m above the ground using a metallic net. The volatile sampling interval was determined according to the different instars of S. bifasciatus larvae (Li et al., 2020).

Overview of measurement fingerprints

This work aimed to propose a broadly applicable topographic segmentation-based feature extraction strategy for GC-IMS fingerprints, so biological samples were used directly for method optimization instead of a simple standard mixture. Based on inspection of the measurement fingerprints, the target area (RIP < drift time < 15 ms and 100 s < retention time < 1100 s), which excludes the non-signal area, was cropped for segmentation processing. The volatile compounds detected in S. bifasciatus-damaged P.

Conclusions

This study proposed a topographic segmentation strategy to extract peak features from GC-IMS fingerprints. The strategy separated individual peak areas from the original fingerprints using a watershed segmentation algorithm and an active contour model, extracted the maximum value and volume features from the separated chromatographic peaks, and then integrated the features of all identified markers from different samples into a dataset of uniform size. The results showed that almost all chromatographic peaks

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was funded by the National Natural Science Foundation of China [grant number 31670654]. The authors would also like to acknowledge the Lingyan management committee of Mount Tai management district for providing plant samples and other experiments support.
