Abstract
The Chinese materia medica resource survey provides an important basis for the development of the traditional Chinese medicine (TCM) industry. During the survey, millions of images of medicinal plants are collected. The collected dataset includes images that are unqualified for image analysis, i.e., they cannot be used to build a medicinal plant classifier. Identifying these unqualified images manually is burdensome, so screening them automatically is an important task of the survey. Image recognition techniques have developed quickly in recent years, and outlier detection, an unsupervised approach, can be used to find unqualified images automatically. Much research has been done on this topic, and the extracted features and the correlation metric both play important roles in the outlier detection result. To improve image screening performance, a novel outlier detection method is proposed in this paper. A convolutional neural network (CNN) is used to extract complex features from Chinese materia medica resource images. Extended entropy is introduced into the calculation of information loss, which is used to measure the distance between images. Based on the extracted image features and this correlation metric, the proposed method detects outlier images by clustering. The efficiency of the screening method is illustrated with a practical example.
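The general idea of screening images by an information-loss distance and clustering can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the feature vectors are assumed to have been extracted already (for example by a CNN), the weighted Jensen-Shannon divergence based on ordinary Shannon entropy stands in for the extended-entropy information loss, and a simple one-pass leader clustering stands in for the clustering step. The names `to_distribution`, `information_loss`, `screen_outliers`, `tau`, and `min_size` are hypothetical.

```python
import numpy as np

def to_distribution(features, eps=1e-12):
    """Map non-negative feature vectors to probability distributions (rows sum to 1)."""
    f = np.clip(np.asarray(features, dtype=float), 0.0, None) + eps
    return f / f.sum(axis=1, keepdims=True)

def kl(p, q):
    """Kullback-Leibler divergence between two strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

def information_loss(p, q, w_p=0.5, w_q=0.5):
    """Weighted Jensen-Shannon divergence, used here as a stand-in distance.

    Assumption: ordinary Shannon entropy, not the extended entropy of the paper.
    """
    m = w_p * p + w_q * q
    return w_p * kl(p, m) + w_q * kl(q, m)

def screen_outliers(features, tau=0.05, min_size=2):
    """One-pass leader clustering; images in clusters smaller than min_size are flagged.

    tau and min_size are hypothetical tuning parameters.
    """
    dists = to_distribution(features)
    centroids, members = [], []
    for i, p in enumerate(dists):
        losses = [information_loss(p, c) for c in centroids]
        if losses and min(losses) < tau:
            # Join the closest existing cluster and update its centroid.
            k = int(np.argmin(losses))
            members[k].append(i)
            centroids[k] = dists[members[k]].mean(axis=0)
        else:
            # No cluster is close enough: start a new one.
            centroids.append(p.copy())
            members.append([i])
    return sorted(i for m in members if len(m) < min_size for i in m)

# Toy feature vectors (made up for illustration): four similar images and one that differs.
example = [[10, 1, 1, 1], [9, 1, 1, 2], [10, 2, 1, 1], [9, 2, 1, 1], [1, 1, 1, 10]]
flagged = screen_outliers(example)
print(flagged)
```

Leader clustering depends on the order in which images are visited, so a real screening pipeline would likely need a more robust clustering step and a threshold chosen on validation data.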





Acknowledgements
This work is partially supported by the Shandong Science and Technology Development Plan (Grant Nos. 2016GGC01061 and 2016GGX101029) and the Natural Science Foundation of Shandong Province (Grant Nos. ZR2015JL023 and ZR2015FL025).
Cite this article
Zhang, X., Sun, Z. & Li, Z. Chinese materia medica resource images screening method study. Multimed Tools Appl 77, 22771–22786 (2018). https://doi.org/10.1007/s11042-017-5501-4
DOI: https://doi.org/10.1007/s11042-017-5501-4