ABSTRACT
A major difference between corporate intranets and the Internet is that in intranets the barrier for users to create web pages is much higher. This limits the amount and quality of anchor text, one of the major factors used by Internet search engines, and makes intranet search more difficult. The social phenomenon at play also means that spam is relatively rare. Both on the Internet and in intranets, users are often willing to cooperate with the search engine to improve the search experience. These characteristics make user feedback a natural means of improving search quality in intranets. In this paper we show how a particular form of feedback, namely user annotations, can be used to improve the quality of intranet search. An annotation is a short description of the contents of a web page and can be considered a substitute for anchor text. We propose two ways to obtain user annotations, using explicit and implicit feedback, and show how they can be integrated into a search engine. Preliminary experiments on the IBM intranet demonstrate that using annotations improves search quality.