Abstract
In previous work, we constructed a Key Term Concurrence Network (KTCN) from a large-scale corpus and used the weighted shortest path length between terms as a measure of their semantic relevance. We tentatively applied this measure to query expansion in an Information Retrieval task targeting complex user queries expressed in natural language. The experimental results showed improved performance on the task. However, we also found that as more expanded terms are appended to the original query vector, performance rises to a peak and then decreases drastically. This paper explains the causes of this phenomenon from two perspectives: complex network properties and corpus linguistics. Based on these explanations, we outline directions for improving the approach in future work.
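The idea of ranking expansion candidates by weighted shortest path length can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the function names (`build_network`, `shortest_path_lengths`, `expand_query`), and the choice of edge weight as the reciprocal of the co-occurrence count are all assumptions made for illustration, since the abstract does not state how KTCN edges are weighted.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_network(documents):
    """Build an undirected term co-occurrence graph.

    Assumption: edge weight = 1 / co-occurrence count, so terms that
    co-occur more often are "closer". The paper may weight edges differently.
    """
    counts = defaultdict(int)
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), n in counts.items():
        graph[a][b] = 1.0 / n
        graph[b][a] = 1.0 / n
    return graph


def shortest_path_lengths(graph, source):
    """Dijkstra: weighted shortest path length from source to each reachable term."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def expand_query(graph, query_terms, k):
    """Return the k non-query terms closest to any query term, nearest first."""
    best = {}
    for term in query_terms:
        for other, d in shortest_path_lengths(graph, term).items():
            if other not in query_terms:
                best[other] = min(d, best.get(other, float("inf")))
    ranked = sorted(best.items(), key=lambda item: (item[1], item[0]))
    return ranked[:k]


# Hypothetical toy corpus: each document is a list of key terms.
docs = [
    ["network", "path", "term"],
    ["network", "path", "query"],
    ["query", "retrieval"],
]
graph = build_network(docs)
for term, distance in expand_query(graph, ["network"], k=4):
    print(term, distance)
```

In this toy setting, each additional expansion term is at least as far from the query as the previous one, so enlarging `k` pulls in progressively less related terms. That is the kind of behavior the abstract associates with the drop in retrieval performance after the peak.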
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yin, L., Yang, H., Ji, D., Zhang, M., Wu, H. (2013). Rapid Increase of the Weighted Shortest Path Length in Key Term Concurrence Network and Its Origin. In: Ji, D., Xiao, G. (eds) Chinese Lexical Semantics. CLSW 2012. Lecture Notes in Computer Science, vol 7717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36337-5_27
DOI: https://doi.org/10.1007/978-3-642-36337-5_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36336-8
Online ISBN: 978-3-642-36337-5
eBook Packages: Computer Science, Computer Science (R0)