Abstract
The high cost of data consolidation is the key market inhibitor to the adoption of traditional information integration and data warehousing solutions. In this paper, we outline a next-generation integrated database management system that takes traditional information integration, content management, and data warehouse techniques to the next level: the system will be able to integrate a very large number of information sources and automatically construct a global business view in terms of “Universal Business Objects”. We describe techniques for discovering, unifying, and aggregating data from a large number of disparate data sources. Enabling technologies for our solution are XML, web services, caching, messaging, and portals for real-time dashboarding and reporting.
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Brown, P., Haas, P., Myllymaki, J., Pirahesh, H., Reinwald, B., Sismanis, Y. (2005). Toward Automated Large-Scale Information Integration and Discovery. In: Härder, T., Lehner, W. (eds) Data Management in a Connected World. Lecture Notes in Computer Science, vol 3551. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11499923_9
DOI: https://doi.org/10.1007/11499923_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26295-4
Online ISBN: 978-3-540-31654-1
eBook Packages: Computer Science (R0)