short-paper

Demonstrating CAESURA: Language Models as Multi-Modal Query Planners

Authors:

Matthias Urban,

Carsten Binnig

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data

Pages 472 - 475

https://doi.org/10.1145/3626246.3654732

Published: 09 June 2024


Abstract

In many domains, multi-modal data plays an important role, and modern question-answering (QA) systems based on LLMs allow users to query this data using simple natural language queries. Retrieval Augmented Generation (RAG) is a recent approach that extends Large Language Models (LLMs) with database technology to enable such multi-modal QA systems. In RAG, relevant data is first retrieved from a vector database and then fed into an LLM that computes the query result. However, RAG-based approaches have severe issues, for example with efficiency and scalability, since LLMs have high inference costs and can only process limited amounts of data. Therefore, in this demo paper, we propose CAESURA, a database-first approach that extends databases with LLMs. The main idea is that CAESURA utilizes the reasoning capabilities of LLMs to translate natural language queries into execution plans. Using such execution plans allows CAESURA to process multi-modal data outside the LLM using query operators and optimization strategies that are rooted in the scalable query execution strategies of databases. Our demo allows users to experience CAESURA on two example datasets containing tables, texts, and images.
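The plan-then-execute idea described in the abstract can be illustrated with a minimal Python sketch. This is not CAESURA's implementation: the planner is a hard-coded stand-in for an LLM call, the vision model is replaced by a lookup table, and all names (`Step`, `plan_query`, `execute`, `FAKE_LABELS`, the operator names, and the toy data) are hypothetical. The sketch only shows the division of labor: the model is asked once for a plan, and the data itself is processed by ordinary operators outside the model.

```python
from dataclasses import dataclass
from typing import Any

Row = dict[str, Any]

# Toy table; the "image" column holds hypothetical file names.
PAINTINGS: list[Row] = [
    {"title": "A", "year": 1750, "image": "a.png"},
    {"title": "B", "year": 1850, "image": "b.png"},
    {"title": "C", "year": 1901, "image": "c.png"},
]

# Assumption: stand-in for a vision model, mapping an image to detected labels.
FAKE_LABELS: dict[str, set[str]] = {
    "a.png": {"dog"},
    "b.png": {"dog", "horse"},
    "c.png": {"ship"},
}


@dataclass(frozen=True)
class Step:
    """One operator in an execution plan (hypothetical plan format)."""
    op: str
    args: tuple


def plan_query(question: str) -> list[Step]:
    """Assumption: stand-in for the LLM planner; returns a fixed plan."""
    return [
        Step("filter_gt", ("year", 1800)),         # relational filter
        Step("image_contains", ("image", "dog")),  # multi-modal filter
        Step("count", ()),                         # aggregation
    ]


def execute(plan: list[Step], rows: list[Row]) -> Any:
    """Run the plan step by step; no row is ever passed to the planner."""
    result = rows
    for step in plan:
        if step.op == "filter_gt":
            col, bound = step.args
            result = [r for r in result if r[col] > bound]
        elif step.op == "image_contains":
            col, label = step.args
            result = [r for r in result if label in FAKE_LABELS[r[col]]]
        elif step.op == "count":
            return len(result)
        else:
            raise ValueError(f"unknown operator: {step.op}")
    return result


question = "How many paintings painted after 1800 depict a dog?"
answer = execute(plan_query(question), PAINTINGS)
print(answer)  # prints 1
```

In this sketch the cheap relational filter runs before the per-image check, so the stand-in vision model is consulted for fewer rows. That ordering is the kind of decision a database-style optimizer could make once the query is expressed as a plan.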

Supplemental Material

MP4 File

Presentation video of CAESURA



Cited By

Urban M, Binnig C (2024). ELEET: Efficient Learned Query Execution over Text and Tables. Proceedings of the VLDB Endowment 17:13 (4867-4880). DOI: 10.14778/3704965.3704989. Online publication date: 1-Sep-2024.
https://dl.acm.org/doi/10.14778/3704965.3704989
Zhang Y, Henkel J, Floratou A, Cahoon J, Deep S, Patel J (2024). ReAcTable: Enhancing ReAct for Table Question Answering. Proceedings of the VLDB Endowment 17:8 (1981-1994). DOI: 10.14778/3659437.3659452. Online publication date: 31-May-2024.
https://dl.acm.org/doi/10.14778/3659437.3659452

Index Terms

Demonstrating CAESURA: Language Models as Multi-Modal Query Planners
1. Information systems
  1. Data management systems
    1. Database design and models
      1. Data model extensions
        Semi-structured data



Published In

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data

June 2024

694 pages

ISBN:9798400704222

DOI:10.1145/3626246

General Chairs: Pablo Barcelo (Universidad Catolica, Chile), Nayat Sanchez-Pi (INRIA Chile)

Program Chairs: Alexandra Meliou (University of Massachusetts Amherst, USA), S. Sudarshan (Indian Institute of Technology Bombay)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Short-paper

Funding Sources

BMBF and State of Hesse

Conference

SIGMOD/PODS '24

Sponsor:

SIGMOD

SIGMOD/PODS '24: International Conference on Management of Data

June 9 - 15, 2024

Santiago, Chile

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

Total Citations: 2
Total Downloads: 252
Downloads (Last 12 months): 252
Downloads (Last 6 weeks): 52

Reflects downloads up to 01 Mar 2025
