Symptom selection for multi-label data of inquiry diagnosis in traditional Chinese medicine

Shao, Huan; Li, GuoZheng; Liu, GuoPing; Wang, YiQin

doi:10.1007/s11432-011-4406-5

Research Paper
Published: 02 January 2012

Volume 56, pages 1–13, (2013)

Science China Information Sciences


Abstract

In traditional Chinese medicine (TCM) diagnosis, a patient may be associated with more than one syndrome tag, so computer-aided TCM diagnosis is a typical application of multi-label learning on high-dimensional data. A large number of symptoms commonly occur in TCM diagnosis, which complicates the modeling of diagnostic algorithms. Feature selection entails choosing the smallest subset of relevant symptoms while maximizing the generalization performance of the model. At present, little research exists on feature selection for multi-label data. This paper introduces a hybrid optimization technique for symptom selection on multi-label data in TCM diagnosis and builds models with four multi-label learning algorithms, including k-nearest neighbors. We compare the performance of the algorithm with that of currently popular dimensionality reduction algorithms such as MEFS (embedded feature selection for multi-label learning) and MDDM (multi-label dimensionality reduction via dependence maximization) on the UCI Yeast gene functional data set and an inquiry diagnosis data set of coronary heart disease (CHD). Experimental results show that the presented algorithm significantly improves performance. In particular, the average precision of the classifier improves by up to 10.62% and 14.54%. The paper thereby realizes syndrome inquiry modeling of CHD in TCM, providing an effective reference for the diagnosis of CHD and for the analysis of other multi-label data.
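
The abstract reports its gains in average precision, a ranking-based measure commonly used in multi-label learning. As a minimal sketch, assuming the usual definition (for each instance, the mean over its relevant labels of the fraction of relevant labels ranked at or above that label), the measure could be computed as follows. The function name `average_precision`, the tie-breaking rule, and the toy data are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def average_precision(scores: Sequence[Sequence[float]],
                      labels: Sequence[Sequence[int]]) -> float:
    """Multi-label average precision, averaged over instances.

    scores[i][j] is the classifier's score for label j on instance i;
    labels[i][j] is 1 if label j is relevant to instance i, else 0.
    Instances with no relevant label are skipped (an assumption).
    """
    total = 0.0
    counted = 0
    for s, y in zip(scores, labels):
        relevant = [j for j, flag in enumerate(y) if flag]
        if not relevant:
            continue
        # Rank 1 is the highest score; ties are broken by label index
        # because sorted() is stable.
        order = sorted(range(len(s)), key=lambda j: -s[j])
        rank = {j: r for r, j in enumerate(order, start=1)}
        ap = 0.0
        for j in relevant:
            # Number of relevant labels ranked at or above label j.
            hits = sum(1 for k in relevant if rank[k] <= rank[j])
            ap += hits / rank[j]
        total += ap / len(relevant)
        counted += 1
    return total / counted if counted else 0.0


# Toy data: two patients, three hypothetical syndrome labels.
toy_scores = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]]
toy_labels = [[1, 0, 1], [1, 0, 0]]
result = average_precision(toy_scores, toy_labels)
```

On the toy data, the first instance ranks both relevant labels first (precision 1), while the second ranks its only relevant label third (precision 1/3), so `result` is 2/3. In a wrapper-style symptom selection, a score of this kind could be evaluated on held-out data for each candidate symptom subset.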


References

Tian L, Yan Y J, Zhu J G. Data mining techniques and their application in TCM study (in Chinese). Chinese J Basic Med Trad Chin Med, 2005, 11: 710–712
Tsoumakas G, Zhang M L, Zhou Z H. Learning from multi-label data. In: Tutorial at ECML/PKDD’09, Bled, Slovenia, 2009
Zhang Y, Zhou Z H. Multi-label dimensionality reduction via dependence maximization. ACM Trans Knowl Discov Data, 2010, 4(3): Article No. 14
Yu K, Yu S P, Tresp V. Multi-label informed latent semantic indexing. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2005. 258–265
Ji S W, Ye J P. Linear dimensionality reduction for multi-label classification. In: Proceedings of the 21st International Conference on Artificial Intelligence, Pasadena, CA, 2009. 1077–1082
Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res, 2003, 3: 1157–1182
Ge L, Li G Z, You M Y. Embedded feature selection for multi-label learning (in Chinese). J Nanjing Univ (Nat Sci), 2009, 45: 671–676
Moody J, Utans J. Principled architecture selection for neural networks: Application to corporate bond rating prediction. In: Moody J E, Hanson S J, Lippmann P R, eds. Neural Information Processing Systems 4. San Francisco, CA: Morgan Kaufmann Publishers, Inc, 1992. 683–690
Zhang M L, Pena J M, Robles V, et al. Feature selection for multi-label naive Bayes classification. Inf Sci, 2009, 179: 3218–3229
Li G C, Li C T, Huang L P, et al. An investigation into regularity of syndrome classification for chronic atrophic gastritis based on structural equation model (in Chinese). J Nanjing Univ Trad Chin Med, 2006, 22: 217–220
Wang X W, Qu H B, Wang J. A quantitative diagnostic method based on data-mining approach in TCM (in Chinese). J Beijing Univ Trad Chin Med, 2005, 28: 4–7
Liu G P, Li G Z, Wang Y L, et al. Modeling of inquiry diagnosis for coronary heart disease in traditional Chinese medicine by using multi-label learning. BMC Complem Altern Med, 2010, 10: 37
Gheyas I A, Smith L S. Feature subset selection in large dimensionality domains. Patt Recogn, 2010, 43: 5–13
Blickle T, Thiele L. A comparison of selection schemes used in evolutionary algorithms. Evolut Comput, 1996, 4: 361–394
Motoki T. Calculating the expected loss of diversity of selection schemes. Evolut Comput, 2002, 10: 397–422
Sokolov A, Whitley D. Unbiased tournament selection. In: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation. Washington DC: ACM, 2005. 1131–1138
Zhang M L, Zhou Z H. ML-KNN: A lazy learning approach to multi-label learning. Patt Recogn, 2007, 40: 2038–2048
Zhang M L, Zhou Z H. Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans Knowl Data Eng, 2006, 18: 1338–1351
Elisseeff A, Weston J. A kernel method for multi-labelled classification. Adv Neur Inf Process Syst, 2002, 14: 681–687
Ronen M, Jacob Z. Using simulated annealing to optimize feature selection problem in marketing applications. Europ J Oper Res, 2006, 171: 842–858
Yang J, Honavar V. Feature subset selection using a genetic algorithm. IEEE Intell Syst Appl, 1998, 13: 44–49
Pudil P, Novovičová J, Kittler J, et al. Floating search methods in feature selection. Patt Recogn Lett, 1994, 15: 1119–1125

Author information

Authors and Affiliations

School of Computer Engineering and Science, Shanghai University, Shanghai, 200072, China
Huan Shao
Department of Control Science and Engineering, Key Laboratory of Ministry of Education for Service Computing and Embedded Systems, Tongji University, Shanghai, 201804, China
GuoZheng Li
Laboratory of Information Access and Synthesis of TCM Four Diagnosis, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
GuoPing Liu & YiQin Wang

Corresponding author

Correspondence to GuoZheng Li.

About this article

Cite this article

Shao, H., Li, G., Liu, G. et al. Symptom selection for multi-label data of inquiry diagnosis in traditional Chinese medicine. Sci. China Inf. Sci. 56, 1–13 (2013). https://doi.org/10.1007/s11432-011-4406-5

Received: 18 August 2011
Accepted: 22 November 2011
Published: 02 January 2012
Issue Date: May 2013
DOI: https://doi.org/10.1007/s11432-011-4406-5
