Abstract
Wikis are now used as knowledge management systems in individual enterprises. The initial explanations of word entries (entities) in such a system can be generated from pages on the enterprise's Intranet. However, these internal pages cannot cover all aspects of the entities. To address this problem, this paper enriches the explanations of entities by exploiting Web pages on the Internet. The task consists of three steps. First, for each entity, an initial page set is obtained from the Internet with the help of search engines. Second, the pages that are highly correlated with the entity are located within the page set. Finally, new snippets are produced from these pages; those that enhance the explanation are kept and redundant ones are discarded. Each candidate snippet is evaluated on two aspects: its correlation with the entity, and its ability to enhance the existing explanation. Experimental results on a real data set show that the proposed method is effective in supplementing the existing explanation with Web pages from outside the enterprise.
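The two-aspect evaluation of candidate snippets can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the measures, so the sketch assumes cosine similarity over bag-of-words vectors as a stand-in for both the correlation with the entity and the redundancy with the existing explanation. The function names (`bag_of_words`, `cosine`, `select_snippets`), the multiplicative score, and the thresholds `min_relevance` and `max_overlap` are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_snippets(entity_context, explanation, candidates,
                    k=2, min_relevance=0.1, max_overlap=0.6):
    """Greedily pick up to k snippets that are relevant and not redundant.

    Assumed scoring (illustrative only): relevance * (1 - overlap), where
    relevance is similarity to the entity context and overlap is similarity
    to the text covered so far (explanation plus already chosen snippets).
    """
    entity_vec = bag_of_words(entity_context)
    covered = bag_of_words(explanation)
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        best, best_score = None, 0.0
        for snippet in remaining:
            vec = bag_of_words(snippet)
            relevance = cosine(vec, entity_vec)
            overlap = cosine(vec, covered)
            # Skip snippets that are off-topic or repeat what is already covered.
            if relevance < min_relevance or overlap > max_overlap:
                continue
            score = relevance * (1.0 - overlap)
            if score > best_score:
                best, best_score = snippet, score
        if best is None:
            break
        chosen.append(best)
        remaining.remove(best)
        # Later candidates are also compared against the snippet just chosen.
        covered += bag_of_words(best)
    return chosen
```

Calling `select_snippets` with a short description of the entity, the current wiki explanation, and a list of candidate snippets returns the snippets that pass both checks, in the order they were chosen. Updating `covered` after each pick means a snippet that only repeats an earlier pick is treated as redundant as well.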
Supported by NSFC under Grant Nos. 60673129 and 60773162, the 863 Program under Grant No. 2007AA01Z154, and the 2008/2009 HP Labs Innovation Research Program.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, L., Wang, Y., Huang, C., Zhang, Y. (2010). Enriching the Contents of Enterprises’ Wiki Systems with Web Information. In: Shen, H.T., et al. Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16720-1_24
DOI: https://doi.org/10.1007/978-3-642-16720-1_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16719-5
Online ISBN: 978-3-642-16720-1
eBook Packages: Computer Science (R0)