Abstract
The capability of OLAP database software systems to handle data complexity comes at a high price for analysts, presenting them with a combinatorially vast space of views of a relational database. We respond to the need for technologies that allow users to guide themselves to areas of local structure by casting the space of “views” of an OLAP database as a combinatorial object of all projections and subsets, and “view discovery” as a search process over that lattice. We equip the view lattice with statistical, information-theoretic measures sufficient to support a combinatorial optimization process. We outline “hop-chaining” as a particular view discovery algorithm over this object, wherein users are guided across a permutation of the dimensions by searching for successive two-dimensional views, pushing seen dimensions into an increasingly large background filter in a “spiraling” search process. We illustrate this work in the context of data cubes recording summary statistics for radiation portal monitors at US ports.
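The hop-chaining idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which information-theoretic measure is used, so mutual information between two dimensions stands in here as an assumed score. The function names (`mutual_information`, `hop_chain`), the `pick_value` callback that simulates the user's choice of filter value, and the representation of the cube as a list of dict records are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(rows, x, y):
    """Mutual information (in bits) between columns x and y.

    Assumed stand-in for the paper's (unspecified) view measure.
    `rows` is a list of dict records, one key per dimension.
    """
    n = len(rows)
    if n == 0:
        return 0.0
    joint = Counter((r[x], r[y]) for r in rows)
    px = Counter(r[x] for r in rows)
    py = Counter(r[y] for r in rows)
    # sum over observed cells of p(a,b) * log2(p(a,b) / (p(a) * p(b)))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def most_frequent(rows, dim):
    """Default stand-in for a user's choice: the most frequent value."""
    return Counter(r[dim] for r in rows).most_common(1)[0][0]


def hop_chain(rows, dims, pick_value=most_frequent):
    """Greedy sketch of a hop-chain over the dimensions.

    Returns a list of ((x, y), background_filter) steps. At each hop the
    older dimension of the current 2-D view is fixed to one value and
    pushed into the background filter, and the best-scoring unseen
    dimension is paired with the remaining one.
    """
    # Initial view: the highest-scoring pair of dimensions.
    x, y = max(combinations(dims, 2),
               key=lambda p: mutual_information(rows, *p))
    remaining = [d for d in dims if d not in (x, y)]
    background = {}
    steps = [((x, y), dict(background))]
    while remaining and rows:
        value = pick_value(rows, x)
        background[x] = value                      # grow the filter
        rows = [r for r in rows if r[x] == value]  # restrict the data
        z = max(remaining, key=lambda d: mutual_information(rows, y, d))
        remaining.remove(z)
        x, y = y, z                                # hop to the next view
        steps.append(((x, y), dict(background)))
    return steps


if __name__ == "__main__":
    # Toy data: dimensions "a" and "b" are perfectly correlated.
    demo = [{"a": i % 2, "b": i % 2, "c": (i // 2) % 2, "d": (i // 4) % 2}
            for i in range(8)]
    for view, background in hop_chain(demo, ["a", "b", "c", "d"]):
        print(view, background)
```

With n dimensions the sketch produces n - 1 two-dimensional views, each sharing one dimension with its predecessor, which is one way to read the "permutation of the dimensions" and the growing background filter in the abstract. A real system would score views against the OLAP cube's aggregates rather than raw records.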
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Joslyn, C., Burke, J., Critchlow, T., Hengartner, N., Hogan, E. (2009). View Discovery in OLAP Databases through Statistical Combinatorial Optimization. In: Winslett, M. (eds) Scientific and Statistical Database Management. SSDBM 2009. Lecture Notes in Computer Science, vol 5566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02279-1_4
DOI: https://doi.org/10.1007/978-3-642-02279-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02278-4
Online ISBN: 978-3-642-02279-1
eBook Packages: Computer Science, Computer Science (R0)