An Improvement Approach for Word Tendency Using Decision Tree

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3684)

Abstract

In any text, words occur with different frequencies, and keywords are strongly related to the subject of the text. Word frequencies also change over time. An earlier method used a decision tree to estimate stability classes, which indicate how a word's popularity varies over time, from frequency changes in text data over given periods. The estimation precision of the decision tree drops when the number of data items is unevenly distributed among the classes. This paper suggests a new way to apply a Random Sampling Method and proposes a new Data Copying Method to improve the estimation precision of the decision tree. With the new Data Copying Method, F-measures improved by 9% for the Increasing class, 9% for the Relatively Constant class, and 18% for the Decreasing class.
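
The abstract does not spell out how either method works, so the following is only an illustrative sketch of the general idea of evening out class sizes before training a decision tree. It assumes that random sampling means keeping an equally sized random subset of each class, and that data copying means repeating items of smaller classes until they match the largest one; both readings are assumptions, not the authors' definitions. The function names and the toy data are hypothetical. The F-measure is the standard harmonic mean of precision and recall.

```python
import random
from collections import Counter, defaultdict


def group_by_class(samples):
    """Group (features, label) pairs by their class label."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append((features, label))
    return by_class


def random_sample_balance(samples, seed=0):
    """Assumed reading of random sampling: keep a random subset of
    each class, sized to match the smallest class."""
    rng = random.Random(seed)
    by_class = group_by_class(samples)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))
    return balanced


def copy_balance(samples):
    """Assumed reading of data copying: repeat the items of each
    class cyclically until it matches the largest class."""
    by_class = group_by_class(samples)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label in sorted(by_class):
        items = by_class[label]
        balanced.extend(items[i % len(items)] for i in range(target))
    return balanced


def f_measure(true_labels, predicted_labels, cls):
    """Standard F-measure (harmonic mean of precision and recall)
    for a single class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical, unevenly distributed toy data: 6 / 3 / 2 items per class.
samples = (
    [((i,), "constant") for i in range(6)]
    + [((i,), "increasing") for i in range(3)]
    + [((i,), "decreasing") for i in range(2)]
)

print(Counter(label for _, label in random_sample_balance(samples)))
print(Counter(label for _, label in copy_balance(samples)))
```

Either balanced list could then be passed to any decision tree learner. Copying keeps every original item at the cost of duplicates, whereas sampling discards items from the larger classes.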

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Atlam, ES., Ghada, E., Fuketa, M., Morita, K., Aoe, Ji. (2005). An Improvement Approach for Word Tendency Using Decision Tree. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2005. Lecture Notes in Computer Science, vol 3684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11554028_84

  • DOI: https://doi.org/10.1007/11554028_84

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28897-8

  • Online ISBN: 978-3-540-31997-9

  • eBook Packages: Computer Science, Computer Science (R0)
