research-article

Matrix Query Languages

Published: 02 December 2021

Abstract

Due to the importance of linear algebra and matrix operations in data analytics, there has been a renewed interest in developing query languages that combine both standard relational operations and linear algebra operations. We survey aspects of the matrix query language MATLANG and extensions thereof, and connect matrix query languages to classical query languages and arithmetic circuits.

Cited By

  • (2022) Query processing on tensor computation runtimes. Proceedings of the VLDB Endowment 15:11 (2811-2825). DOI: 10.14778/3551793.3551833. Online publication date: 1-Jul-2022
  • (2022) Functional collection programming with semi-ring dictionaries. Proceedings of the ACM on Programming Languages 6:OOPSLA1 (1-33). DOI: 10.1145/3527333. Online publication date: 29-Apr-2022

Published In

ACM SIGMOD Record, Volume 50, Issue 3
September 2021
30 pages
ISSN: 0163-5808
DOI: 10.1145/3503780
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
