An Improvement Approach for Word Tendency Using Decision Tree

Atlam, El-Sayed; Ghada, Elmarhomy; Fuketa, Masao; Morita, Kazuhiro; Aoe, Jun-ichi

doi:10.1007/11554028_84

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3684)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

In any text, words occur with varying frequencies, and keywords are strongly related to the subject of the text. Word frequencies also change over time. An earlier method used a decision tree to estimate stability classes, which indicate how a word's popularity changes over time, from frequency changes in text data over given periods. The estimation precision of the decision tree decreases when the number of data items is unevenly distributed among the classes. This paper suggests a new way to use a Random Sampling Method and proposes a new Data Copying Method to improve the estimation precision of the decision tree. With the new Data Copying Method, F-measures improved by 9% for the Increasing Class, 9% for the Relatively Constant Class, and 18% for the Decreasing Class.
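
The abstract names the two balancing methods but does not define them. The sketch below is therefore only an illustration under stated assumptions: it treats random sampling as randomly reducing every class to the size of the smallest class, and data copying as repeating items of the smaller classes until each class matches the largest one. Both are common ways to rebalance training data for a decision tree and are not confirmed as the authors' exact procedure. The function names, the toy frequency vectors, and the class labels are hypothetical. The F-measure here is the usual harmonic mean of precision and recall, computed per class.

```python
import random
from collections import Counter, defaultdict


def group_by_class(samples):
    """Group (features, label) pairs by their class label."""
    groups = defaultdict(list)
    for features, label in samples:
        groups[label].append((features, label))
    return groups


def random_sampling(samples, seed=0):
    """Assumed reading: randomly keep, per class, only as many items
    as the smallest class contains (undersampling)."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    target = min(len(items) for items in groups.values())
    balanced = []
    for label in sorted(groups):
        balanced.extend(rng.sample(groups[label], target))
    return balanced


def data_copying(samples):
    """Assumed reading: repeat items of the smaller classes until every
    class has as many items as the largest class (oversampling)."""
    groups = group_by_class(samples)
    target = max(len(items) for items in groups.values())
    balanced = []
    for label in sorted(groups):
        items = groups[label]
        # Cycle through the class's items until the target size is reached.
        balanced.extend(items[i % len(items)] for i in range(target))
    return balanced


def f_measure(true_labels, predicted_labels, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: word frequencies over three periods, with an
# uneven number of items per stability class.
toy = (
    [([1, 2, 4], "increasing")] * 2
    + [([3, 3, 3], "constant")] * 6
    + [([5, 2, 1], "decreasing")] * 1
)

print(Counter(label for _, label in random_sampling(toy)))
print(Counter(label for _, label in data_copying(toy)))
```

Either balanced list could then be passed to any decision tree learner in place of the original training set. Undersampling discards data from the larger classes, while copying keeps every item at the cost of exact duplicates.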


Author information

Authors and Affiliations

Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, 770-8506, Japan
El-Sayed Atlam, Elmarhomy Ghada, Masao Fuketa, Kazuhiro Morita & Jun-ichi Aoe

Editor information

Editors and Affiliations

School of Business, La Trobe University, 3086, Melbourne, Victoria, Australia
Rajiv Khosla
Centre for SMART systems Engineering Research Centre, University of Brighton, Moulsecoomb, BN2 4GJ, Brighton, UK
Robert J. Howlett
School of Electrical and Information Engineering, Knowledge Based Intelligent Engineering Systems Centre, University of South Australia, 5095, Mawson Lakes, SA, Australia
Lakhmi C. Jain

About this paper

Cite this paper

Atlam, ES., Ghada, E., Fuketa, M., Morita, K., Aoe, Ji. (2005). An Improvement Approach for Word Tendency Using Decision Tree. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2005. Lecture Notes in Computer Science, vol 3684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11554028_84

DOI: https://doi.org/10.1007/11554028_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28897-8
Online ISBN: 978-3-540-31997-9
eBook Packages: Computer Science, Computer Science (R0)
