Automated Explanations of User-Expected Trends for Aggregate Queries

Ibrahim, Ibrahim A.; Li, Xue; Zhao, Xin; Maskari, Sanad Al; Albarrak, Abdullah M.; Zhang, Yanjun

doi:10.1007/978-3-319-93034-3_48


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10937)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

Recently, a deeper level of data exploration has emerged that enables users to investigate anomalies in their query results. This level of exploration strives to explain why a particular anomaly exists within a query result by providing a set of explanations. These explanations are precisely a set of alterations that, when applied to the original query, cause the anomalies to disappear. Trends are pattern changes in business applications that are generated from SQL aggregate queries. A user-expected trend is a particular pattern change in the data that was supposed to happen according to business studies.

In this paper, we generalize this process to automatically produce explanations for user-expected trends. We propose the User Trend Explanations (UTE) framework, which provides insightful explanations by taking a set of user-specified points (called a prospective trend) and finding the top explanation that produces this trend. We develop a notion of uniformity of a predicate on a given output, and implement a set of algorithms to search the data space efficiently and effectively. The key idea is to harness the linear search space rather than the exponential one, enabling the accurate explanations that are possible with tuples. Our experiments on real datasets show that UTE provides significant improvements over state-of-the-art related algorithms.
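The idea of explaining a prospective trend can be illustrated with a small sketch. It is not the UTE algorithm itself, whose details (including the uniformity measure) are not given in the abstract. It assumes that an explanation is a single equality predicate whose tuples are removed from the input, that the aggregate is a SUM grouped by one attribute, and that closeness to the prospective trend is measured by a sum of squared differences. All function names, the toy table, and the distance measure are hypothetical choices made for illustration. Enumerating single-attribute predicates keeps the candidate set linear in the number of distinct attribute values, whereas conjunctions of predicates would grow exponentially.

```python
from collections import defaultdict


def aggregate(rows, group_key, value_key):
    """Compute SUM(value_key) GROUP BY group_key as a dict."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def trend_distance(result, expected):
    """Sum of squared differences at the user-specified points (assumed measure)."""
    return sum((result.get(g, 0.0) - v) ** 2 for g, v in expected.items())


def candidate_predicates(rows, attributes):
    """One equality predicate per distinct (attribute, value) pair."""
    return sorted({(a, row[a]) for row in rows for a in attributes})


def top_explanation(rows, group_key, value_key, attributes, expected):
    """Return the predicate whose removal brings the result closest to the trend."""
    best = None
    for attr, val in candidate_predicates(rows, attributes):
        # Alter the query input by dropping every tuple that matches the predicate.
        kept = [r for r in rows if r[attr] != val]
        dist = trend_distance(aggregate(kept, group_key, value_key), expected)
        if best is None or dist < best[1]:
            best = ((attr, val), dist)
    return best


# Hypothetical sales table; monthly totals are 20, 42, 64.
rows = [
    {"month": 1, "region": "north", "channel": "web", "sales": 10.0},
    {"month": 1, "region": "south", "channel": "store", "sales": 10.0},
    {"month": 2, "region": "north", "channel": "web", "sales": 12.0},
    {"month": 2, "region": "south", "channel": "store", "sales": 30.0},
    {"month": 3, "region": "north", "channel": "store", "sales": 14.0},
    {"month": 3, "region": "south", "channel": "web", "sales": 50.0},
]

# Prospective trend: the monthly totals the user expected to see.
expected = {1: 10.0, 2: 12.0, 3: 14.0}

best = top_explanation(rows, "month", "sales", ["region", "channel"], expected)
print(best)
```

In this toy table, removing the tuples with region = "south" reproduces the expected monthly totals exactly, so that predicate is returned as the top explanation. A fuller treatment would also rank the remaining candidates and handle other aggregate functions.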



Author information

Authors and Affiliations

The University of Queensland, Brisbane, Australia
Ibrahim A. Ibrahim, Xue Li, Xin Zhao, Sanad Al Maskari, Abdullah M. Albarrak & Yanjun Zhang


Corresponding author

Correspondence to Ibrahim A. Ibrahim.

Editor information

Editors and Affiliations

Deakin University, Geelong, Victoria, Australia
Dinh Phung
National Chiao Tung University, Hsinchu City, Taiwan
Vincent S. Tseng
Monash University, Clayton, Victoria, Australia
Geoffrey I. Webb
Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Bao Ho
University of Melbourne, Melbourne, Victoria, Australia
Mohadeseh Ganji
University of Melbourne, Melbourne, Victoria, Australia
Lida Rashidi


About this paper

Cite this paper

Ibrahim, I.A., Li, X., Zhao, X., Maskari, S.A., Albarrak, A.M., Zhang, Y. (2018). Automated Explanations of User-Expected Trends for Aggregate Queries. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_48


DOI: https://doi.org/10.1007/978-3-319-93034-3_48
Published: 19 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93033-6
Online ISBN: 978-3-319-93034-3
eBook Packages: Computer Science, Computer Science (R0)
