View Discovery in OLAP Databases through Statistical Combinatorial Optimization

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5566)

Abstract

The capability of OLAP database software systems to handle data complexity comes at a high price for analysts, presenting them with a combinatorially vast space of views of a relational database. We respond to the need for technologies that allow users to guide themselves to areas of local structure by casting the space of “views” of an OLAP database as a combinatorial object, the lattice of all projections and subsets, and “view discovery” as a search process over that lattice. We equip the view lattice with statistical information-theoretical measures sufficient to support a combinatorial optimization process. We outline “hop-chaining” as a particular view discovery algorithm over this object, wherein users are guided across a permutation of the dimensions by searching for successive two-dimensional views, pushing seen dimensions into an increasingly large background filter in a “spiraling” search process. We illustrate this work in the context of data cubes recording summary statistics for radiation portal monitors at US ports.
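
The abstract gives only an outline of hop-chaining, so the following is a minimal sketch of the idea under stated assumptions, not the authors' algorithm. It assumes records are plain dictionaries, that a two-dimensional view is scored by the mutual information of its count table (one possible information-theoretical measure, chosen here for illustration), and that each seen dimension is fixed at its most frequent value, standing in for a choice the user would make. The names `project`, `mutual_information`, and `hop_chain` are hypothetical.

```python
from collections import Counter
from math import log2


def project(rows, dims, filt):
    """Count records over `dims`, keeping only rows that match the filter."""
    counts = Counter()
    for row in rows:
        if all(row[d] == v for d, v in filt.items()):
            counts[tuple(row[d] for d in dims)] += 1
    return counts


def mutual_information(counts):
    """Mutual information, in bits, of a two-dimensional count table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    px, py = Counter(), Counter()
    for (x, y), c in counts.items():
        px[x] += c
        py[y] += c
    return sum(
        (c / total) * log2(c * total / (px[x] * py[y]))
        for (x, y), c in counts.items()
    )


def hop_chain(rows, dims, start):
    """Greedy walk over the dimensions (assumed scoring and filter policy)."""
    filt = {}                      # background filter: seen dimension -> value
    current = start
    remaining = [d for d in dims if d != start]
    path = []
    while remaining:
        # Score every candidate 2-D view (current, d) under the filter.
        scored = [
            (mutual_information(project(rows, (current, d), filt)), d)
            for d in remaining
        ]
        score, nxt = max(scored, key=lambda pair: pair[0])
        path.append((current, nxt, score))
        # Push the seen dimension into the background filter. Assumption:
        # fix it at its most frequent value under the filter so far.
        marginal = project(rows, (current,), filt)
        filt[current] = marginal.most_common(1)[0][0][0]
        remaining.remove(nxt)
        current = nxt
    return path, filt
```

Calling `hop_chain(rows, ["a", "b", "c"], "a")` returns the sequence of two-dimensional views visited, each with its score, and the background filter accumulated along the way. Because the walk is greedy, it visits one permutation of the dimensions and does not search the whole view lattice.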

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Joslyn, C., Burke, J., Critchlow, T., Hengartner, N., Hogan, E. (2009). View Discovery in OLAP Databases through Statistical Combinatorial Optimization. In: Winslett, M. (eds) Scientific and Statistical Database Management. SSDBM 2009. Lecture Notes in Computer Science, vol 5566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02279-1_4

  • DOI: https://doi.org/10.1007/978-3-642-02279-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02278-4

  • Online ISBN: 978-3-642-02279-1

  • eBook Packages: Computer Science (R0)