ABSTRACT
Modern neural network (NN) models require ever more data and parameters to perform increasingly complicated tasks. One approach to training such massive NNs is to distribute them across multiple devices, which gives rise to the device placement problem: deciding which part of the model runs on which device. Most state-of-the-art solutions to this problem leverage graph embedding techniques. In this work, we assess the impact of different graph embedding techniques on the quality of device placement, measured by (i) the execution time of the partitioned NN models and (ii) the computation time of the graph embedding technique itself. In particular, we extend Placeto, a state-of-the-art device placement solution, and evaluate two graph embedding techniques, GraphSAGE and P-GNN, against Placeto's original graph embedding model, Placeto-GNN. In terms of execution-time improvement, P-GNN outperforms Placeto-GNN by 23.967%, while GraphSAGE produces 1.165% better results than Placeto-GNN. Regarding computation time, GraphSAGE is 11.569% faster than Placeto-GNN, whereas P-GNN is 6.95% slower.
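To make the role of graph embeddings concrete, the sketch below shows the core of one technique we evaluate: a single GraphSAGE-style mean-aggregation layer applied to a toy computation graph, producing one embedding per NN operation. This is a minimal illustration under assumed inputs, not Placeto's actual implementation; the toy graph, feature values, and names such as `sage_layer` and `neighbors` are invented for this example.

```python
import numpy as np

# Toy computation graph: operation id -> ids of neighboring operations.
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

# Per-operation features (e.g., compute cost, output size): 4 nodes x 2 features.
X = np.array([[1.0, 0.5],
              [0.3, 0.9],
              [0.7, 0.2],
              [0.4, 0.6]])

rng = np.random.default_rng(0)
W_self = rng.normal(size=(2, 4))   # weights applied to a node's own features
W_neigh = rng.normal(size=(2, 4))  # weights applied to aggregated neighbor features

def sage_layer(X, neighbors, W_self, W_neigh):
    """One GraphSAGE-style layer: mean-aggregate neighbor features,
    combine with the node's own features, apply ReLU, then L2-normalize."""
    H = np.zeros((X.shape[0], W_self.shape[1]))
    for v in range(X.shape[0]):
        agg = np.mean(X[neighbors[v]], axis=0)   # mean over neighbor features
        H[v] = np.maximum(X[v] @ W_self + agg @ W_neigh, 0.0)
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    return H / np.maximum(norms, 1e-12)          # unit-length embeddings

embeddings = sage_layer(X, neighbors, W_self, W_neigh)
print(embeddings.shape)  # (4, 4): one embedding per operation
```

In a Placeto-style pipeline, per-operation embeddings like these are fed to a reinforcement learning policy that assigns each operation to a device; swapping the embedding model (Placeto-GNN, GraphSAGE, or P-GNN) while keeping the rest of the pipeline fixed is the comparison this work performs.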
REFERENCES
- R. Addanki et al. 2019. Placeto: Learning generalizable device placement algorithms for distributed machine learning. arXiv:1906.08879 (2019).
- J. Dean et al. 2012. Large scale distributed deep networks. In Advances in Neural Information Processing Systems. 1223--1231.
- W. Hamilton et al. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems. 1024--1034.
- H. Pham et al. 2018. Efficient neural architecture search via parameter sharing. arXiv:1802.03268 (2018).
- S. Hochreiter et al. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735--1780.
- Y. Huang et al. 2018. FlexPS: Flexible parallelism control in parameter server architecture. Proceedings of the VLDB Endowment 11, 5 (2018), 566--579.
- R. Mayer et al. 2017. The TensorFlow partitioning and scheduling problem: It's the critical path!. In DIDL. 1--6.
- A. Mirhoseini et al. 2017. Device placement optimization with reinforcement learning. arXiv:1706.04972 (2017).
- A. Mirhoseini et al. 2018. A hierarchical model for device placement. In ICLR.
- A. Nazi et al. 2019. GAP: Generalizable approximate graph partitioning framework. arXiv:1903.00614 (2019).
- J. Schulman et al. 2017. Proximal policy optimization algorithms. arXiv:1707.06347 (2017).
- A. Sergeev et al. 2018. Horovod: Fast and easy distributed deep learning in TensorFlow. arXiv:1802.05799 (2018).
- A. Vaswani et al. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998--6008.
- Y. Wu et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144 (2016).
- J. You et al. 2019. Position-aware graph neural networks. arXiv:1906.04817 (2019).
- Y. Gao et al. 2018. Spotlight: Optimizing device placement for training deep neural networks. In ICML. 1676--1684.
- Y. Zhou et al. 2019. GDP: Generalized Device Placement for Dataflow Graphs. arXiv:1910.01578 (2019).