
Personalized e-news monitoring agent system for tracking user-interested Chinese news events

Applied Intelligence

Abstract

Many print newspapers have been converted to digital form and published on the Internet, and digital newspapers are becoming a popular electronic medium for conveying information immediately. Google developed a news service, Google news alert, built on the Google news aggregator, that tracks news events of interest to a user by keyword matching. Because the service monitors and tracks news events using only keyword matching, it retrieves many irrelevant news events and sends them to users. In other words, the current service cannot monitor news events by specific news topic: its recall rate is high, but its precision rate is low when tracking user-interested news events. This study therefore presents a novel personalized e-news monitoring agent system that uses a topic-tracking-based approach to address this flaw of the keyword-based approach when tracking user-interested news events on the Google News site. The proposed scheme considers both the similarities and the semantic relationships among news topics when tracking news events. In addition, to further improve accuracy in tracking user-interested Chinese news events, the Chinese word segmentation system ECScanner (An Extension Chinese Lexicon Scanner), which includes a new-word extension mechanism, is proposed for the Chinese word segmentation process. Experimental results demonstrated that the proposed topic-based scheme is superior to the keyword-based approach used by Google news alert in terms of precision rate while retaining a high recall rate when tracking user-interested news events. Experimental results also confirmed that, compared with the conventional Chinese word segmentation system CKIP (Chinese Knowledge Information Processing), the proposed ECScanner with its novel extension mechanism for new words improves the accuracy rate in tracking user-interested news events.
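The contrast between keyword matching and topic-based tracking, and the precision and recall rates used to compare them, can be illustrated with a minimal sketch. This is not the system described in the abstract: the toy documents, the topic profile, the 0.5 threshold, and all function names are assumptions made for illustration, and plain cosine similarity over bag-of-words vectors stands in for the paper's topic-tracking scheme, which also considers semantic relationships among topics. The tokens are assumed to be already segmented.

```python
import math
from collections import Counter


def keyword_match(doc_tokens, keywords):
    """Keyword-based tracking: flag a document if it contains any keyword."""
    return any(k in doc_tokens for k in keywords)


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_match(doc_tokens, topic_tokens, threshold=0.5):
    """Similarity-based tracking: flag a document close enough to the topic profile."""
    return cosine_similarity(Counter(doc_tokens), Counter(topic_tokens)) >= threshold


def precision_recall(predicted, relevant):
    """Precision = TP / |predicted|, recall = TP / |relevant|."""
    tp = len(predicted & relevant)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical, pre-segmented toy documents (not data from the paper).
docs = {
    0: ["apple", "iphone", "launch", "event"],
    1: ["apple", "product", "launch", "price"],
    2: ["apple", "pie", "recipe", "sugar"],
    3: ["apple", "orchard", "harvest", "farm"],
    4: ["stock", "market", "rally", "bank"],
}
relevant = {0, 1}                                # documents the user actually wants
keywords = ["apple"]                             # keyword-alert style query
topic = ["apple", "iphone", "launch", "product"]  # assumed topic profile

keyword_hits = {i for i, d in docs.items() if keyword_match(d, keywords)}
topic_hits = {i for i, d in docs.items() if topic_match(d, topic)}

keyword_pr = precision_recall(keyword_hits, relevant)
topic_pr = precision_recall(topic_hits, relevant)
print("keyword:", keyword_pr)
print("topic:  ", topic_pr)
```

On this toy data the keyword query retrieves every document that mentions the keyword, so recall stays high while precision drops, whereas the similarity threshold filters out the off-topic documents. The numbers only illustrate the metrics and say nothing about the paper's experimental results.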



Author information

Corresponding author

Correspondence to Chih-Ming Chen.

About this article

Cite this article

Chen, CM., Liu, CY. Personalized e-news monitoring agent system for tracking user-interested Chinese news events. Appl Intell 30, 121–141 (2009). https://doi.org/10.1007/s10489-007-0106-7

  • DOI: https://doi.org/10.1007/s10489-007-0106-7
