research-article

A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP

Authors:
Immanuel Trummer, Cornell University, Ithaca, NY, USA
Yicheng Wang, Cornell University, Ithaca, NY, USA
Saketh Mahankali, Cornell University, Ithaca, NY, USA

SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, June 2019, Pages 936–953
https://doi.org/10.1145/3299869.3300089

Published: 25 June 2019

ABSTRACT

We focus on the problem of answering OLAP queries via voice output. We present a holistic approach that combines query processing and result vocalization. We use the following key ideas to minimize processing overheads and maximize answer quality. First, our approach samples from the database to evaluate alternative speech fragments. OLAP queries are not fully evaluated. Instead, sampling focuses on result aspects that are relevant for voice output. To guide sampling, we rely on methods from the area of Monte-Carlo Tree Search. Second, we use pipelining to interleave query processing and voice output. The system starts providing the user with high-level insights while generating more fine-grained results in the background. Third, we optimize speech output to maximize the user's information gain under speaking time constraints. We use a maximum-entropy model to predict the user's belief about OLAP results after listening to voice output. Based on that model, we select the most informative speech fragments (i.e., the ones minimizing the distance between user belief and actual data). We analyze formal properties of the proposed speech structure and the complexity of our algorithm. We also compare alternative vocalization approaches in an extensive user study.
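The first idea in the abstract, sampling rows to compare alternative speech fragments instead of fully evaluating the query, can be illustrated with a small sketch. This is not the paper's algorithm: it uses the flat UCB1 bandit rule (the selection rule that UCT-style Monte-Carlo Tree Search applies at each tree node) rather than a full tree search, and the reward function, the fragment representation as `(text, claimed_value)` pairs, and the names `ucb1_select`, `fragment_reward`, and `pick_fragment` are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, totals, step, c=math.sqrt(2)):
    """Pick the fragment index with the highest UCB1 score."""
    # Try every fragment once before comparing scores.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def score(arm):
        mean = totals[arm] / counts[arm]
        bonus = c * math.sqrt(math.log(step) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=score)


def fragment_reward(claimed_value, sampled_value, scale):
    """Hypothetical reward in [0, 1]: 1 if the claim matches the sampled row."""
    return max(0.0, 1.0 - abs(claimed_value - sampled_value) / scale)


def pick_fragment(fragments, rows, budget, seed=0):
    """Spend `budget` row samples to find the fragment closest to the data.

    fragments: list of (text, claimed_value) pairs (assumed representation).
    rows: numeric values of the aggregated column (stand-in for a database).
    """
    rng = random.Random(seed)
    counts = [0] * len(fragments)
    totals = [0.0] * len(fragments)
    scale = (max(rows) - min(rows)) or 1.0
    for step in range(1, budget + 1):
        arm = ucb1_select(counts, totals, step)
        row = rng.choice(rows)  # one sampled row, not a full scan
        totals[arm] += fragment_reward(fragments[arm][1], row, scale)
        counts[arm] += 1
    best = max(
        range(len(fragments)),
        key=lambda i: totals[i] / counts[i] if counts[i] else -1.0,
    )
    return fragments[best][0], counts
```

In this sketch most of the sampling budget flows to the fragments that look promising, so poor candidates are discarded after a few samples. A real system would also need the pipelining and the belief model described in the abstract, which are not modeled here.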



Published in
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data
June 2019
2106 pages
ISBN: 9781450356435
DOI: 10.1145/3299869
General Chairs:
Peter Boncz, CWI & Vrije Universiteit Amsterdam, The Netherlands
Stefan Manegold, CWI & Universiteit Leiden, The Netherlands
Program Chairs:
Anastasia Ailamaki, EPFL, Switzerland
Amol Deshpande, University of Maryland, USA
Tim Kraska, MIT, USA
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
olap
vocalization
voice-based interfaces
voice-output
Qualifiers
- research-article
Acceptance Rates
SIGMOD '19 Paper Acceptance Rate: 88 of 430 submissions, 20%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%