Enriching the Contents of Enterprises’ Wiki Systems with Web Information

Zhao, Li; Wang, Yexin; Huang, Congrui; Zhang, Yan

doi:10.1007/978-3-642-16720-1_24

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6185)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Wikis are currently used as knowledge management systems within individual enterprises. The initial explanations of word entries (entities) in such a system can be generated from pages on the enterprise's Intranet. However, the information on these internal pages cannot cover all aspects of the entities. To address this problem, this paper enriches the explanations of entities by exploiting Web pages on the Internet. The task consists of three steps. First, an initial page set is obtained from the Internet for each entity with the help of search engines. Second, the pages that correlate highly with the entity are located within this page set. Finally, new snippets are produced from these pages; those that enhance the explanation are kept and redundant ones are discarded. Each candidate snippet is evaluated on two aspects: its correlation with the entity, and its ability to enhance the existing explanation. Experimental results on a real data set show that the proposed method effectively supplements the existing explanation by exploiting Web pages from outside the enterprise.
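The final step, scoring candidate snippets on two aspects and discarding redundant ones, can be illustrated with a minimal sketch. The abstract does not specify the paper's actual scoring functions, so everything here is an assumption made for illustration: bag-of-words cosine similarity stands in for the entity correlation, one minus the maximum similarity to already covered text stands in for the enhancement ability, and the function `select_snippets` and its parameters `alpha`, `max_overlap`, and `k` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_snippets(entity, explanation, candidates, alpha=0.5, max_overlap=0.8, k=3):
    """Greedily pick up to k snippets that are relevant to the entity and novel.

    Hypothetical scoring (not the paper's): a weighted sum of
    relevance (similarity to the entity) and novelty (1 - overlap with
    the existing explanation and with snippets already chosen).
    """
    entity_vec = Counter(tokenize(entity))
    covered = [Counter(tokenize(explanation))]
    remaining = list(candidates)
    chosen = []
    while remaining and len(chosen) < k:
        best, best_score = None, 0.0
        for snippet in remaining:
            vec = Counter(tokenize(snippet))
            overlap = max(cosine(vec, c) for c in covered)
            if overlap >= max_overlap:
                continue  # redundant with text that is already covered
            relevance = cosine(vec, entity_vec)
            score = alpha * relevance + (1 - alpha) * (1 - overlap)
            if relevance > 0 and score > best_score:
                best, best_score = snippet, score
        if best is None:
            break  # nothing left that is both relevant and novel
        chosen.append(best)
        covered.append(Counter(tokenize(best)))
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    picked = select_snippets(
        "wiki",
        "A wiki is a collaborative website",
        [
            "A wiki is a collaborative website",
            "The wiki engine stores page revisions in a database",
            "Bananas are rich in potassium",
        ],
    )
    print(picked)
```

In this sketch a snippet that repeats the existing explanation is dropped by the `max_overlap` threshold, and a snippet that never mentions the entity is dropped because its relevance is zero. A real system would likely use stronger text representations than raw term counts.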

Supported by NSFC under Grant Nos. 60673129 and 60773162, the 863 Program under Grant No. 2007AA01Z154, and the 2008/2009 HP Labs Innovation Research Program.

Author information

Authors and Affiliations

Department of Machine Intelligence, Peking University, Beijing, 100871, China
Li Zhao, Yexin Wang, Congrui Huang & Yan Zhang
Key Laboratory on Machine Perception, Ministry of Education, Beijing, 100871, China
Li Zhao, Yexin Wang, Congrui Huang & Yan Zhang

Editor information

Editors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD, Australia
Heng Tao Shen
School of Computing Science, Simon Fraser University, 8888 University Drive, V5A 1S6, Burnaby, BC, Canada
Jian Pei
David R. Cheriton School of Computer Science, University of Waterloo, Canada
M. Tamer Özsu
Peking University, China
Lei Zou
Renmin University of China, China
Jiaheng Lu
National University of Singapore, Singapore
Tok-Wang Ling
Northeastern University, 110004, Shenyang, China
Ge Yu
College of Computer Science, Zhejiang University, 310027, Hangzhou, P.R. China
Yi Zhuang
University of Melbourne, Australia
Jie Shao

About this paper

Cite this paper

Zhao, L., Wang, Y., Huang, C., Zhang, Y. (2010). Enriching the Contents of Enterprises’ Wiki Systems with Web Information. In: Shen, H.T., et al. Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16720-1_24

DOI: https://doi.org/10.1007/978-3-642-16720-1_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16719-5
Online ISBN: 978-3-642-16720-1
eBook Packages: Computer Science, Computer Science (R0)
