Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study

Rucci, Enzo; De Giusti, Armando; Naiouf, Marcelo

doi:10.1007/978-3-319-75214-3_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 790)

Included in the following conference series:

Argentine Congress of Computer Science

Abstract

Manycore processors are gaining ground in the HPC community as a way to improve performance while maintaining power efficiency. Knights Landing is the recently released second generation of the Intel Xeon Phi architecture. While the optimization of applications on CPUs, GPUs and first-generation Xeon Phi devices has been studied extensively in recent years, the new features of Knights Landing processors call for a revision of the programming and optimization techniques for these devices. In this work, we select the Floyd-Warshall algorithm as a representative case study of graph and memory-bound applications. Starting from the default serial version, we show how data-, thread- and compiler-level optimizations help the parallel implementation reach 338 GFLOPS.
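
For readers unfamiliar with the blocked formulation named in the title, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes the common three-phase tiling scheme (diagonal block, then the blocks in the same block row and column, then all remaining blocks) and non-negative edge weights. The function names, the example graph and the block size are illustrative assumptions. Python is used only for readability; the vectorization, threading and compiler-level optimizations discussed in the abstract are not shown.

```python
import math

INF = math.inf  # marks "no edge" between two vertices


def floyd_warshall(dist):
    """Plain serial Floyd-Warshall on a copy of the distance matrix."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]
    return d


def _relax_block(d, bi, bj, bk, bs, n):
    """Relax block (bi, bj) through the intermediate vertices of block bk."""
    for k in range(bk * bs, min((bk + 1) * bs, n)):
        for i in range(bi * bs, min((bi + 1) * bs, n)):
            dik = d[i][k]
            for j in range(bj * bs, min((bj + 1) * bs, n)):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]


def blocked_floyd_warshall(dist, bs):
    """Three-phase blocked Floyd-Warshall with block size bs (hypothetical name)."""
    n = len(dist)
    d = [row[:] for row in dist]
    nb = (n + bs - 1) // bs  # number of blocks per dimension
    for bk in range(nb):
        # Phase 1: the diagonal block depends only on itself.
        _relax_block(d, bk, bk, bk, bs, n)
        # Phase 2: blocks sharing the row or column of the diagonal block.
        for b in range(nb):
            if b != bk:
                _relax_block(d, bk, b, bk, bs, n)
                _relax_block(d, b, bk, bk, bs, n)
        # Phase 3: every remaining block; these are mutually independent.
        for bi in range(nb):
            if bi == bk:
                continue
            for bj in range(nb):
                if bj != bk:
                    _relax_block(d, bi, bj, bk, bs, n)
    return d


# Small illustrative graph (5 vertices, so the last block is partial for bs=2).
EXAMPLE = [
    [0,   3,   INF, 7,   INF],
    [8,   0,   2,   INF, INF],
    [5,   INF, 0,   1,   INF],
    [2,   INF, INF, 0,   4],
    [INF, 1,   INF, INF, 0],
]
```

Calling `blocked_floyd_warshall(EXAMPLE, 2)` should return the same matrix as `floyd_warshall(EXAMPLE)`. The point of the tiling is that each block-level update touches a small working set, which favors cache reuse, and the phase 3 blocks can be updated in parallel.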

Acknowledgments

The authors thank the ArTeCS Group from Universidad Complutense de Madrid for letting us use their Xeon Phi KNL system.

Author information

Authors and Affiliations

III-LIDI, CONICET, Facultad de Informática, Universidad Nacional de La Plata, 1900, La Plata, Buenos Aires, Argentina
Enzo Rucci & Armando De Giusti
III-LIDI, Facultad de Informática, Universidad Nacional de La Plata, 1900, La Plata, Buenos Aires, Argentina
Marcelo Naiouf

Corresponding author

Correspondence to Enzo Rucci.

Editor information

Editors and Affiliations

Universidad Nacional de La Plata, La Plata, Argentina
Armando Eduardo De Giusti

Cite this paper

Rucci, E., De Giusti, A., Naiouf, M. (2018). Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study. In: De Giusti, A. (eds) Computer Science – CACIC 2017. CACIC 2017. Communications in Computer and Information Science, vol 790. Springer, Cham. https://doi.org/10.1007/978-3-319-75214-3_5

DOI: https://doi.org/10.1007/978-3-319-75214-3_5
Published: 26 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75213-6
Online ISBN: 978-3-319-75214-3
eBook Packages: Computer Science, Computer Science (R0)