Abstract
The impressive advances in global networking and information technology provide great opportunities for all kinds of Web-based information services, ranging from digital libraries and information discovery to virtual-enterprise workflows and electronic commerce. However, many of these services still exhibit rather poor quality in terms of unacceptable performance during load peaks, frequent and long outages, and unsatisfactory search results. For the next decade, the overriding goal of database research should be to provide means for building zero-administration, self-tuning information services with predictable response time, virtually continuous availability, and, ultimately, “moneyback” service-quality guarantees. A particularly challenging aspect of this theme is the quality of search results in digital libraries, scientific data repositories, and on the Web. To aim for more intelligent search that can truly find needles in haystacks, classical information retrieval methods should be integrated with querying capabilities for structurally richer Web data, most notably XML data, and automatic classification methods based on standardized ontologies and statistical machine learning. This paper gives an overview of promising research directions along these lines.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Weikum, G. (2001). The Web in 2010: Challenges and Opportunities for Database Research. In: Wilhelm, R. (eds) Informatics. Lecture Notes in Computer Science, vol 2000. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44577-3_1
DOI: https://doi.org/10.1007/3-540-44577-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41635-7
Online ISBN: 978-3-540-44577-7
eBook Packages: Springer Book Archive