Abstract
Detecting the topic of documents can help readers construct the background of the topic and facilitate document comprehension. In this paper, we propose a semantic frame-based topic detection (SFTD) that simulates such process in human perception. We take advantage of multiple knowledge sources and extracted discriminative patterns from documents through a highly automated, knowledge-supported frame generation and matching mechanisms. Using a Chinese news corpus containing over 111,000 news articles, we provide a comprehensive performance evaluation which demonstrates that our novel approach can effectively detect the topic of a document by exploiting the syntactic structures, semantic association, and the context within the text. Experimental results show that SFTD is comparable to other well-known topic detection methods.
Similar content being viewed by others
References
Alani H, Kim S, Millard DE, Weal MJ, Hall W, Lewis PH, Shadbolt NR (2003) Automatic ontology-based knowledge extraction from web documents. Intell Syst IEEE 18(1):14–21
Baeza-Yates R, Ribeiro-Neto B (2011) Modern information retrieval: the concepts and technology behind search. Addison Wesley, New York
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Bollacker K, Evans C, Paritosh P, Sturge T, Taylor J (2008) Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data. ACM, pp 1247–1250
Bun KK, Ishizuka M (2002) Topic extraction from news archive using tf*pdf algorithm. In: international conference on web information systems engineering. IEEE Computer Society, p 73
CKIP (2009) An introduction to E-HowNet (E-HowNet technical report). Tech. rep, Academia Sinica
Dong Z, Dong Q, Hao C (2010) HowNet and its computation of meaning. In: Proceedings of the 23rd international conference on computational linguistics: demonstrations, association for computational linguistics, pp 53–56
García-Sánchez F, Martínez-Béjar R, Contreras L, Fernández-Breis JT, Castellanos-Nieves D (2006) An ontology-based intelligent system for recruitment. Exp Syst Appl 31(2):248–263
Grineva M, Grinev M, Lizorkin D (2009) Extracting key terms from noisy and multitheme documents. In: Proceedings of the 18th international conference on world wide web. ACM, pp 661–670
Hsu W, Chen Y, Wang Y (1998) A context sensitive model for concept understanding. In: Proceeding of 3rd international conference on information theoretic approaches to logic, language, and computation
Lee CS, Jian ZW, Huang LK (2005) A fuzzy ontology and its application to news summarization. IEEE Trans Syst Man Cybernet Part B Cybernet 35(5):859–880
Lee CS, Chang YC, Wang MH (2009) Ontological recommendation multi-agent for Tainan city travel. Exp Syst Appl 36(3):6740–6753
Li S, Lv X, Wang T, Shi S (2010) The key technology of topic detection based on k-means. In: 2010 international conference on future information technology and management engineering (FITME), vol 2. IEEE, pp 387–390
Lovász L (1993) Random walks on graphs: a survey. Combinatorics, Paul erdos is eighty 2(1):1–46
Manning CD, Schütze H (1999) Foundations of statistical natural language processing, vol 999. MIT Press, Cambridge
Nallapati R, Feng A, Peng F, Allan J (2004) Event threading within news topics. In: Proceedings of the thirteenth ACM international conference on Information and knowledge management. ACM, pp 446–453
Scott S, Matwin S (1999) Feature engineering for text classification. ICML (Citeseer) 99:379–388
Shih CW, Hsieh YL, Hsu WL (2014) Sense decomposition from e-hownet for word similarity measurement. In: The 3rd IEEE EM-RITE
Tho QT, Hui SC, Fong ACM, Cao TH (2006) Automatic fuzzy ontology generation for semantic web. IEEE Trans Knowl Data Eng 18(6):842–856
Wang MH, Lee CS, Hsieh KL, Hsu CY, Acampora G, Chang CC (2010) Ontology-based multi-agents for intelligent healthcare applications. J Ambient Intell Humaniz Comput 1(2):111–131
Wu Y, Ding Y, Wang X, Xu J (2010) On-line hot topic recommendation using tolerance rough set based topic clustering. J Comput 5(4):549–556
Zhang X, Wang T (2010) Topic tracking with dynamic topic model and topic-based weighting method. J Softw 5(5):482–489
Acknowledgments
This research was supported by the Ministry of Science and Technology of Taiwan under grant MOST 103-3111-Y-001-027.
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by C.-S. Lee.
This research was supported by the National Science Council of Taiwan under Grant NSC102-3111-Y-001-012, NSC102-3113-P-001-006 and NSC 102-3114-Y-307-026.
Rights and permissions
About this article
Cite this article
Chang, YC., Hsieh, YL., Chen, CC. et al. A semantic frame-based intelligent agent for topic detection. Soft Comput 21, 391–401 (2017). https://doi.org/10.1007/s00500-015-1695-4
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-015-1695-4