A semantic frame-based intelligent agent for topic detection

Soft Computing

Abstract

Detecting the topic of a document can help readers construct the background of that topic and facilitate document comprehension. In this paper, we propose a semantic frame-based topic detection (SFTD) method that simulates this process in human perception. We take advantage of multiple knowledge sources and extract discriminative patterns from documents through highly automated, knowledge-supported frame generation and matching mechanisms. Using a Chinese news corpus containing over 111,000 news articles, we provide a comprehensive performance evaluation, which demonstrates that our approach can effectively detect the topic of a document by exploiting the syntactic structures, semantic associations, and context within the text. Experimental results show that the performance of SFTD is comparable to that of other well-known topic detection methods.

Notes

  1. http://www-nlp.stanford.edu/ner/.

    Fig. 2. Architecture of named entity ontology

Acknowledgments

This research was supported by the Ministry of Science and Technology of Taiwan under grant MOST 103-3111-Y-001-027.

Author information

Corresponding author

Correspondence to Yung-Chun Chang.

Additional information

Communicated by C.-S. Lee.

This research was supported by the National Science Council of Taiwan under Grants NSC102-3111-Y-001-012, NSC102-3113-P-001-006 and NSC 102-3114-Y-307-026.

Cite this article

Chang, YC., Hsieh, YL., Chen, CC. et al. A semantic frame-based intelligent agent for topic detection. Soft Comput 21, 391–401 (2017). https://doi.org/10.1007/s00500-015-1695-4

  • DOI: https://doi.org/10.1007/s00500-015-1695-4