A Method on Chinese Thesauri

  • Conference paper
Collaborate Computing: Networking, Applications and Worksharing (CollaborateCom 2016)

Abstract

In recent years, text analysis has attracted growing interest in many fields. Most current text analysis methods use Word2vec, Naïve Bayes, or similar techniques to classify large numbers of texts. However, not every sample is useful for research with demanding requirements, and retrieving related samples with a single keyword is not enough. In this paper, we propose a novel model for secondary text filtering based on Chinese thesauri. It consists of roughly five steps: sample collection, thesaurus construction, word segmentation, word-frequency statistics, and calculation of text relevance. Its main purpose is to make the sample texts match the user's input keywords more accurately and to avoid unnecessary waste of time and space.
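The last three steps can be illustrated with a minimal sketch. The abstract does not specify the paper's segmentation algorithm or relevance formula, so everything here is an assumption: segmentation uses simple forward maximum matching against the thesaurus, relevance is the share of tokens that are thesaurus terms, and the names `segment`, `relevance`, `filter_samples`, and the `threshold` value are hypothetical.

```python
from collections import Counter


def segment(text, vocabulary, max_len=4):
    """Forward maximum matching (an assumed stand-in for the paper's
    word-segmentation step): at each position take the longest
    vocabulary word, falling back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def relevance(text, thesaurus, max_len=4):
    """Assumed relevance score: fraction of tokens that are thesaurus terms."""
    tokens = segment(text, thesaurus, max_len)
    if not tokens:
        return 0.0
    counts = Counter(tokens)  # word-frequency statistics
    hits = sum(counts[term] for term in thesaurus)
    return hits / len(tokens)


def filter_samples(samples, thesaurus, threshold=0.1):
    """Keep only samples whose relevance reaches the (hypothetical) threshold."""
    return [s for s in samples if relevance(s, thesaurus) >= threshold]


# Hypothetical thesaurus built around the keyword "文本分析" (text analysis).
thesaurus = {"文本", "分析", "文本分析"}
samples = ["文本分析很热门", "今天天气好"]
kept = filter_samples(samples, thesaurus)
```

In this sketch the first sample is kept because one of its four tokens is a thesaurus term, while the second sample contains no thesaurus terms and is filtered out. A production system would likely use a dedicated Chinese segmenter and a weighted relevance measure.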

Acknowledgements

The research was supported in part by the National Science Foundation of China under Nos. 61672104, 61170209, 61502038, and U1509214; the Program for New Century Excellent Talents in University, No. NCET-13-0676; and the Key Program of BFSU 2011 Collaborative Innovation Center, No. BFSU2011-ZD04.

Author information

Corresponding author

Correspondence to Fu Chen.

Copyright information

© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Chen, F., Liu, X., Xu, Y., Xu, M., Shi, G. (2017). A Method on Chinese Thesauri. In: Wang, S., Zhou, A. (eds) Collaborate Computing: Networking, Applications and Worksharing. CollaborateCom 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 201. Springer, Cham. https://doi.org/10.1007/978-3-319-59288-6_60

  • DOI: https://doi.org/10.1007/978-3-319-59288-6_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59287-9

  • Online ISBN: 978-3-319-59288-6

  • eBook Packages: Computer Science, Computer Science (R0)
