Automated Explanations of User-Expected Trends for Aggregate Queries

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10937)

Abstract

Recently, a deeper level of data exploration has emerged that enables users to infer anomalies in their queries. This level of exploration strives to explain why a particular anomaly exists within a query result by providing a set of explanations. More precisely, these explanations are a set of alterations that, when applied to the original query, cause the anomalies to disappear. Trends are pattern changes in business applications, generated from SQL aggregate queries. A user-expected trend, in turn, is a particular pattern change in the data that was supposed to happen according to business studies.

In this paper, we generalize this process to automatically produce explanations for user-expected trends. We propose the User Trend Explanations (UTE) framework, which provides insightful explanations by taking a set of user-specified points (called a prospective trend) and finding a top explanation that produces this trend. We develop a notion of uniformity of a predicate on a given output, and implement a set of algorithms to search the data space efficiently and effectively. The key idea is to harness the linear search space rather than the exponential space, to enable the accurate explanations that are possible with tuples. Our experiments on real datasets show that UTE provides significant improvements over state-of-the-art related algorithms.
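The abstract does not spell out the algorithms, so the following is only a toy sketch of the general idea, not the UTE implementation. It assumes a hypothetical table of (month, state, amount) rows, a SUM aggregate grouped by month, and a user-specified prospective trend. Each candidate explanation is a single-value predicate `state = s`; removing the matching rows is the "alteration", and candidates are ranked by how close the altered result comes to the expected points. The data, the function names, and the absolute-gap distance are all assumptions made for illustration, and the paper's uniformity notion is not modeled here.

```python
from collections import defaultdict

# Toy rows: (month, state, amount). All values are made up for illustration.
ROWS = [
    (1, "CA", 10.0), (1, "NY", 10.0),
    (2, "CA", 12.0), (2, "NY", 40.0),
    (3, "CA", 14.0), (3, "NY", 10.0),
]

def sum_by_month(rows):
    """Equivalent of SELECT month, SUM(amount) ... GROUP BY month, as a dict."""
    totals = defaultdict(float)
    for month, _state, amount in rows:
        totals[month] += amount
    return dict(totals)

def distance(result, expected):
    """Sum of absolute gaps between the query result and the expected points."""
    return sum(abs(result.get(k, 0.0) - v) for k, v in expected.items())

def rank_predicates(rows, expected):
    """Score each predicate `state = s` by how close the aggregate gets to
    the expected trend once the rows matching the predicate are removed."""
    scored = []
    for s in sorted({state for _m, state, _a in rows}):
        kept = [r for r in rows if r[1] != s]
        scored.append((distance(sum_by_month(kept), expected), s))
    return sorted(scored)

# Hypothetical prospective trend: the user expected steady growth.
EXPECTED = {1: 10.0, 2: 12.0, 3: 14.0}
ranking = rank_predicates(ROWS, EXPECTED)
print(ranking)  # [(0.0, 'NY'), (32.0, 'CA')]
```

In this toy data, removing the `state = NY` rows reproduces the expected points exactly, so that predicate ranks first. The sketch enumerates one candidate per attribute value, which grows linearly with the number of values, instead of enumerating every subset of tuples, which grows exponentially; whether this matches the paper's notion of a linear search space is an assumption.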

Notes

  1. https://github.com/ibrahimDKE/UTE_Xtrends.

  2. http://www.fec.gov/disclosurep/PDownload.do.

Author information

Corresponding author

Correspondence to Ibrahim A. Ibrahim.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Ibrahim, I.A., Li, X., Zhao, X., Maskari, S.A., Albarrak, A.M., Zhang, Y. (2018). Automated Explanations of User-Expected Trends for Aggregate Queries. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_48

  • DOI: https://doi.org/10.1007/978-3-319-93034-3_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93033-6

  • Online ISBN: 978-3-319-93034-3

  • eBook Packages: Computer Science, Computer Science (R0)