DOI: 10.1145/3269206.3271744
research-article

DiVE: Diversifying View Recommendation for Visual Data Exploration

Published: 17 October 2018

Abstract

To support effective data exploration, there has been growing interest in solutions that automatically recommend data visualizations revealing interesting and useful data-driven insights. In such solutions, a large number of possible data visualization views are generated and ranked according to some metric of importance (e.g., a deviation-based metric), and the top-k most important views are then recommended. One drawback of that approach is that it often recommends similar views, leaving the data analyst with limited insight. To address that limitation, in this work we posit that employing diversification techniques in the process of view recommendation eliminates that redundancy and provides good, concise coverage of the possible insights to be discovered. To that end, we propose a hybrid objective utility function that captures both the importance and the diversity of the insights revealed by the recommended views. While, in principle, traditional diversification methods (e.g., Greedy Construction) provide plausible solutions under our proposed utility function, they suffer from a significantly high query processing cost. In particular, directly applying such methods leads to a "process-first-diversify-next" approach, in which all possible data visualizations are first generated by executing a large number of aggregate queries. To address that challenge, we propose an integrated scheme called DiVE, which efficiently selects the top-k recommended views based on our hybrid utility function. DiVE leverages the properties of both the importance and diversity metrics to prune a large number of query executions without compromising the quality of recommendations. Our experimental evaluation on real datasets shows the performance gains provided by DiVE.
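The abstract does not give the exact form of the hybrid utility function, so the following is only an illustrative sketch of the "process-first-diversify-next" baseline with Greedy Construction, not DiVE itself. It assumes a weighted sum of mean importance and mean pairwise distance with a trade-off parameter `lam`, represents a view as a hypothetical `(dimension, measure, aggregate)` tuple, and uses Jaccard distance over those components as a stand-in diversity metric. The importance scores are made-up values that, in a real system, would each require executing an aggregate query.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Assumed diversity metric: 1 - |A ∩ B| / |A ∪ B| over view components."""
    sa, sb = set(a), set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)


def hybrid_utility(selected, importance, distance, lam):
    """Assumed hybrid form: (1 - lam) * mean importance + lam * mean pairwise distance."""
    if not selected:
        return 0.0
    mean_importance = sum(importance[v] for v in selected) / len(selected)
    pairs = list(combinations(selected, 2))
    mean_distance = sum(distance(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return (1 - lam) * mean_importance + lam * mean_distance


def greedy_top_k(views, importance, distance, k, lam=0.5):
    """Greedy construction: repeatedly add the view that maximizes the hybrid utility."""
    selected, remaining = [], list(views)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda v: hybrid_utility(selected + [v], importance, distance, lam),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical views and importance scores (all scores are computed up front,
# which is exactly the cost the "process-first-diversify-next" approach incurs).
views = [
    ("region", "sales", "SUM"),
    ("region", "sales", "AVG"),
    ("month", "profit", "SUM"),
]
importance = {views[0]: 0.90, views[1]: 0.85, views[2]: 0.60}

by_importance_only = greedy_top_k(views, importance, jaccard_distance, k=2, lam=0.0)
with_diversity = greedy_top_k(views, importance, jaccard_distance, k=2, lam=0.5)
print(by_importance_only)
print(with_diversity)
```

With `lam=0.0` the two near-duplicate `region`/`sales` views are selected, while `lam=0.5` replaces the second one with the less important but more distinct `month`/`profit` view. The sketch scores every candidate before selecting; the pruning of query executions that the abstract attributes to DiVE is not modeled here.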


    Published In

    CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
    October 2018
    2362 pages
    ISBN:9781450360142
    DOI:10.1145/3269206

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data diversification
    2. data exploration
    3. visual analytics

    Qualifiers

    • Research-article

    Funding Sources

    • Advanced Queensland Research Grant
    • Indonesia Endowment Fund for Education (LPDP)

    Conference

    CIKM '18

    Acceptance Rates

    CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 11
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 13 Feb 2025

    Cited By

    • (2024) Calliope-Net: Automatic Generation of Graph Data Facts via Annotated Node-Link Diagrams. IEEE Transactions on Visualization and Computer Graphics 30:1 (562-572). DOI: 10.1109/TVCG.2023.3326925. Online publication date: 1-Jan-2024
    • (2024) AVA: An automated and AI-driven intelligent visual analytics framework. Visual Informatics 8:2 (106-114). DOI: 10.1016/j.visinf.2024.06.002. Online publication date: Jun-2024
    • (2023) Efficient Diversification for Recommending Aggregate Data Visualizations. IEEE Access 11 (62261-62280). DOI: 10.1109/ACCESS.2023.3283457. Online publication date: 2023
    • (2022) VisGuide: User-Oriented Recommendations for Data Event Extraction. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (1-13). DOI: 10.1145/3491102.3517648. Online publication date: 29-Apr-2022
    • (2022) ComputableViz: Mathematical Operators as a Formalism for Visualisation Processing and Analysis. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (1-15). DOI: 10.1145/3491102.3517618. Online publication date: 29-Apr-2022
    • (2022) AI4VIS: Survey on Artificial Intelligence Approaches for Data Visualization. IEEE Transactions on Visualization and Computer Graphics 28:12 (5049-5070). DOI: 10.1109/TVCG.2021.3099002. Online publication date: 1-Dec-2022
    • (2022) Recommending View Bundles in Data Marketplaces. 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (3403-3408). DOI: 10.1109/SMC53654.2022.9945110. Online publication date: 9-Oct-2022
    • (2022) A generic framework for efficient computation of top-k diverse results. The VLDB Journal 32:4 (737-761). DOI: 10.1007/s00778-022-00770-0. Online publication date: 28-Nov-2022
    • (2021) VizGRank: A Context-Aware Visualization Recommendation Method Based on Inherent Relations Between Visualizations. Database Systems for Advanced Applications (244-261). DOI: 10.1007/978-3-030-73200-4_16. Online publication date: 6-Apr-2021
    • (2020) User-oriented Generation of Contextual Visualization Sequences. Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (1-8). DOI: 10.1145/3334480.3383057. Online publication date: 25-Apr-2020
