tutorial

Design and Implementation of Multi-Threaded Algorithms in Polynomial Algebra

Author:

Marc Moreno Maza

ISSAC '21: Proceedings of the 2021 International Symposium on Symbolic and Algebraic Computation

Pages 15 - 20

https://doi.org/10.1145/3452143.3465511

Published: 18 July 2021

Cited By

van der Hoeven J, Maza M, Zhi L (2022). On the Complexity of Symbolic Computation. Proceedings of the 2022 International Symposium on Symbolic and Algebraic Computation, pp. 3-12. Online publication date: 4-Jul-2022.
https://dl.acm.org/doi/10.1145/3476446.3535493

Index Terms

Design and Implementation of Multi-Threaded Algorithms in Polynomial Algebra
1. Computing methodologies
2. Mathematics of computing
  1. Mathematical software
    1. Mathematical software performance

Published In

ISSAC '21: Proceedings of the 2021 International Symposium on Symbolic and Algebraic Computation

July 2021

379 pages

ISBN: 9781450383820

DOI: 10.1145/3452143

General Chair: Frédéric Chyzak, Inria, France

Program Chair: George Labahn, University of Waterloo, Canada

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISSAC '21

Sponsor:

SIGSAM

ISSAC '21: International Symposium on Symbolic and Algebraic Computation

July 18 - 23, 2021

Virtual Event, Russian Federation

Acceptance Rates

Overall Acceptance Rate 395 of 838 submissions, 47%
