DOI: 10.1145/3192975.3193011
research-article

Improving Accuracy in Thai Sign and Symptom Classification using Context-Free Grammar Approach

Published: 24 February 2018

ABSTRACT

We examine our proposed word separator for Thai script, called two-level tokenization (2LT), by applying it to medical Thai text, including chief complaints and ICD-10 descriptions. We verify the tokenization results through machine learning-based classification. The experimental results show that the proposed tokenizer works well with the Classification and Regression Trees (CART) method, achieving 85% precision, 71% recall, and an F1 score of 76%. However, these values are not high enough to make the proposed tokenizer worthwhile. This paper presents how to improve the results of Thai sign and symptom classification. To increase the precision, recall, and F1 score, we adapt the context-free grammar (CFG) concept to exclude unnecessary common conjunction words from consideration. Consequently, the precision, recall, and F1 score rise from 85%, 71%, and 76% to 93%, 86%, and 89%, respectively. This shows that applying the CFG concept yields higher accuracy than the previous experiments without it.
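The idea of using a grammar to drop conjunction words can be illustrated with a minimal sketch. The grammar below, the conjunction list, and the sample tokens are all assumptions made for illustration; the abstract does not give the authors' actual production rules, and the input is assumed to be already tokenized (for example by a tokenizer such as 2LT).

```python
# Illustrative only: this toy grammar, the conjunction list, and the sample
# tokens are assumptions, not the rules used in the paper.
#
#   COMPLAINT -> PHRASE (CONJ PHRASE)*
#   PHRASE    -> WORD+
#   CONJ      -> "และ" | "หรือ" | "กับ"

CONJUNCTIONS = {"และ", "หรือ", "กับ"}  # roughly "and", "or", "with"


def parse_complaint(tokens):
    """Parse a token list with the toy grammar and return its phrases.

    Conjunction tokens are consumed by the CONJ rule, so they never appear
    in the returned phrases. Raises ValueError if the tokens do not match
    the grammar (empty input, leading, trailing, or doubled conjunction).
    """
    phrases = []
    pos = 0
    while True:
        # PHRASE -> WORD+ : collect words up to the next conjunction.
        start = pos
        while pos < len(tokens) and tokens[pos] not in CONJUNCTIONS:
            pos += 1
        if pos == start:
            raise ValueError(f"expected a word at position {pos}")
        phrases.append(tokens[start:pos])
        if pos == len(tokens):
            return phrases
        pos += 1  # consume CONJ and expect another PHRASE


def content_words(tokens):
    """Flatten the parsed phrases into the words kept for classification."""
    return [word for phrase in parse_complaint(tokens) for word in phrase]


# Hypothetical tokenized chief complaint: "headache and have fever".
sample = ["ปวด", "หัว", "และ", "มี", "ไข้"]
print(parse_complaint(sample))
print(content_words(sample))
```

In this sketch the remaining words would then be passed to a classifier such as CART as features; that step is omitted here because the abstract does not describe the feature representation.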


    • Published in

      ACM Other conferences
      ICCAE 2018: Proceedings of the 2018 10th International Conference on Computer and Automation Engineering
      February 2018
      260 pages
      ISBN:9781450364102
      DOI:10.1145/3192975

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited