Towards Next Generation Web Information Retrieval

  • Conference paper
Web Information Systems – WISE 2004 (WISE 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3306)

Abstract

Today, search engines have become one of the most critical applications on the Web, driving many important online businesses that connect people to information. As the Web continues to grow in size, takes in a variety of new data, and penetrates every aspect of people’s lives, the need for more intelligent search engines is increasing. In this talk, we will briefly review the current status of search engines and then present some of our recent work on building next-generation web search technologies. Specifically, we will describe how to extract data records from web pages using a vision-based approach, and we will introduce new research opportunities in exploiting the complementary properties of the surface Web and the deep Web so that web information extraction and deep web crawling facilitate each other. We will also present a search prototype that mines deep web structure to enable one-stop search across multiple online web databases.

Current web search is essentially document-level ranking and retrieval, a paradigm that has been in use in IR for more than 25 years. In contrast, we will introduce our work on a new paradigm called object-level web search, which aims to automatically discover sub-topics (or a taxonomy) for any given query and to arrange the retrieved web documents in a meaningful organization. We are developing techniques to provide object-level ranking, trend analysis, and business intelligence when a search is intended to find web objects such as people, papers, conferences, and interest groups.

We will also talk about vertical search opportunities in emerging areas such as mobile search and media search. In addition to information adaptation for mobile devices, we believe location-based and context-aware search will be important for mobile search. We also think that, by bridging physical-world search and digital-world search, many new user scenarios that do not yet exist in desktop search can potentially make a huge impact on the mobile Internet. For media search, we will present new opportunities in analyzing the multi-typed interrelationships between media objects and other content such as text, hyperlinks, deep web structure, and user interactions, with the aim of better semantic understanding and indexing of media objects. We will also discuss our goal of continually advancing web search to the next level by applying data mining, machine learning, and knowledge discovery techniques to the processes of information analysis, organization, retrieval, and visualization.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, WY., Zhang, H., Hon, HW. (2004). Towards Next Generation Web Information Retrieval. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds) Web Information Systems – WISE 2004. WISE 2004. Lecture Notes in Computer Science, vol 3306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30480-7_3

  • DOI: https://doi.org/10.1007/978-3-540-30480-7_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23894-2

  • Online ISBN: 978-3-540-30480-7

  • eBook Packages: Springer Book Archive
