Precise plant classification within genus level based on simulated annealing aided cloud classifier

https://doi.org/10.1016/j.eswa.2010.08.090

Abstract

This work is part of a series of studies on plant numerical taxonomy. It provides a precise classification method for describing, discriminating, and reviewing proposals for new or revised plant species to be recognized as taxon units within the genus level. We first used all available quantitative attributes to build cloud models for the different sections. The shortest path based simulated annealing algorithm (SPSA) was then applied to optimize these models. Afterwards, the optimized models were validated against the quantitative attribute data used earlier. The accuracy rates of the cloud models for Sect. Tuberculata, Sect. Oleifera and Sect. Paracamellia were 85.00%, 60.00% and 80.00%, respectively. We also found notable overlaps between the type species and the ‘expected species’: the selected expected species Camellia oleifera and Camellia brevistyla are also the type species of Sect. Oleifera and Sect. Paracamellia, respectively. We therefore suggest that the expected species serve as an illustration in plant numerical taxonomy. Based on the simulated annealing aided cloud classifier, the taxon hedges associated with the ‘expected species’ were set to advance our common understanding of sections and improve our ability to recognize and discriminate plant species. These procedures provide a dynamic and practical way to publish new or revised descriptions of species and sections.

Research highlights

► A further application study in a series on the improved cloud classifier algorithm. ► The simulated annealing algorithm is combined with the cloud classifier. ► Simulated annealing is used to minimize the edge distances.

Introduction

Cloud theory is a popular theory for handling uncertainty, based on the uncertain transition between a qualitative concept and its quantitative description (Li et al., 1998, Li et al., 1997, Li et al., 1998). Based on this theory, the cloud classifier has been developed in recent years for adaptive linguistic hedges (Lu, Pi, Peng, Wang, & Zhang, 2009). The classifier represents a qualitative concept with three digital characteristics, the expected value Ex, the entropy σ and the deviation D (Di et al., 1998a, Di et al., 1998b), which integrate the fuzziness and randomness of a linguistic term in a unified way. Our previous work (Lu et al., 2009) applied this classifier to plant numerical taxonomy, using the particle swarm optimization algorithm (PSO) to optimize the sections’ hedges. However, there is still room for improvement, as the accuracy rates are not high enough. In this work, we apply the shortest path based simulated annealing algorithm (SPSA) to optimize the cloud classifier.
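As an illustration of how three digital characteristics can jointly encode fuzziness and randomness, the following is a minimal Python sketch of a forward normal cloud generator. It is not taken from the paper: the function names `cloud_drops` and `membership` are hypothetical, and treating the deviation D as the spread of the entropy σ (the role usually played by hyper-entropy in the normal cloud model) is an assumption.

```python
import math
import random


def cloud_drops(ex, sigma, d, n, seed=0):
    """Generate n (value, membership) cloud drops for one attribute.

    ex    -- expected value Ex of the concept
    sigma -- entropy (spread of the concept)
    d     -- deviation D, ASSUMED here to be the spread of the entropy
    """
    rng = random.Random(seed)
    drops = []
    while len(drops) < n:
        # Randomness: draw a perturbed entropy around sigma.
        s = abs(rng.gauss(sigma, d))
        if s == 0.0:
            continue  # degenerate draw, try again
        # Draw a value around Ex using the perturbed entropy.
        x = rng.gauss(ex, s)
        # Fuzziness: Gaussian-shaped degree of membership in the concept.
        mu = math.exp(-((x - ex) ** 2) / (2.0 * s ** 2))
        drops.append((x, mu))
    return drops


def membership(x, ex, sigma):
    """Membership of x when the entropy is not perturbed (d = 0)."""
    return math.exp(-((x - ex) ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Hypothetical attribute: a leaf measurement with Ex = 5.0.
    for value, mu in cloud_drops(5.0, 1.0, 0.1, 5):
        print(f"value={value:.3f} membership={mu:.3f}")
```

A classifier built on such models could assign a specimen to the section whose cloud gives the highest membership for its attribute values; how the paper combines attributes and weights is not shown in this excerpt.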

SPSA addresses a new kind of dynamic multi-stage facility layout problem under a dynamic business environment; here, analogously, new species may be added to, or old species removed from, their original taxa. Since every section (or genus) has its own range, the distances of species to their ‘expected species’ (Lu et al., 2009) in each section (or genus) should reach their global minimum before the best classification is obtained (Pi et al., 2009). The hedges problem can then be converted into a shortest path problem by studying its distance function and the heuristic rules for adding and removing species, and the corresponding mathematical model can be established (Dong, Wu, & Hou, 2009). Hence, SPSA may perform well in optimizing the cloud classifier.
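The idea of driving the distances between species and an ‘expected species’ toward a global minimum can be sketched with plain simulated annealing. The Python example below is a toy illustration under stated assumptions, not the authors' SPSA: it assumes Euclidean distance between attribute vectors, searches only over which species of a single section acts as the centre, and uses hypothetical names (`total_distance`, `anneal_expected_species`) and made-up data.

```python
import math
import random


def total_distance(points, center_idx):
    """Sum of Euclidean distances from every species to the candidate centre."""
    center = points[center_idx]
    return sum(math.dist(p, center) for p in points)


def anneal_expected_species(points, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Pick the species minimizing the summed distance, via simulated annealing."""
    rng = random.Random(seed)
    current = rng.randrange(len(points))
    current_cost = total_distance(points, current)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(steps):
        candidate = rng.randrange(len(points))
        cost = total_distance(points, candidate)
        delta = cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept
        # worse moves so the search can escape local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t = max(t * cooling, 1e-12)  # geometric cooling with a floor
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical attribute vectors for five species in one section.
    section = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (3.0, 3.5), (1.1, 2.1)]
    idx, cost = anneal_expected_species(section)
    print(idx, round(cost, 3))
```

For a handful of species an exhaustive search would do; annealing becomes relevant when the search space also covers attribute weights or section membership, as the dynamic add/remove setting described above suggests.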

In this research, the shortest path based simulated annealing aided cloud classifier (SPSACM) method is used for plant classification by analyzing leaf morphology and anatomy data, part of which come from our previous work (Lin et al., 2008, Lu et al., 2008). Our purpose is to provide a basic tool for plant taxonomy.

Materials

Adult, fully sun-exposed leaves of plants of the genus Camellia were collected from the International Camellia Species Garden in Jinhua city, including 15 species in Section Paracamellia Sealy: Camellia grijsii Hance, Camellia confusa Craib, Camellia kissi Wall., Camellia fluviatilis Hand.-Mazz., Camellia brevistyla Coh. St., Camellia hiemalis Nakai, Camellia obtusifolia Chang, Camellia maliflora Lindl, Camellia shensiensis Chang, Camellia puniceiflora Chang, Camellia tenii Sealy,

Different classification results based on different attributes

Fig. 2A shows a 3-D distribution of the original data, based on three selected attributes. After the classification procedure, Fig. 2B and C show visible differences between cloud models built on different attribute bases. Fig. 2D displays all available linguistic atoms generated by a series of linguistic atom generators in this research. These generators apply different rules to quantitative and qualitative data.

Cloud models in Fig. 2B, which are based on attributes with small weights (CMSW)

Discussion

In plant numerical taxonomy, floras are the most comprehensive tools people use to identify and distinguish plants (Brach & Song, 2005). In recent years, floras of large scope have been written through the collaboration of many authors, who collectively have examined thousands of plant samples and evaluated and incorporated information from dozens, or even hundreds, of publications (Wen, 1994). For the botanists who edit these floras, two primary issues must be carefully considered: I. How should the

Conclusion

The proposed SPSACM method, based on attribute similarity, extends the cloud model and the simulated annealing algorithm. First, we have demonstrated experimentally that the taxonomic results obtained with the SPSACM method are superior to those of some related methods.

Second, the weight values of attributes are highly recommended for use in establishing flora keys.

In addition, we again propose the new term ‘expected species’: an included species that has the minimum sum of

Acknowledgements

The authors would like to thank Y.F. Huang and L.J. Ma for substantial help in data collection. Funding from the Innovation Fund for the Master’s Academe of Zhejiang Normal University is also gratefully acknowledged.

References (26)

  • K.C. Di et al. Knowledge representation and discovery in spatial databases based on cloud theory. International Archives of Photogrammetry and Remote Sensing (1998).
  • K.C. Di et al. Intelligent query in spatial databases based on cloud model.

  • Heidorn, P. B. (2001). A tool for multipurpose use of online flora and fauna: The biological information browsing...