ABSTRACT
GPU-aided servers are widely used to accelerate the growing computing workload on edge networks. However, massive numbers of computing requests can cause transmission and processing delays at edge servers, leading to long response times and high system energy consumption. Moreover, in distributed scenarios, GPUs from different vendors can hardly communicate directly over the network. Vortex, one of the few open-source GPU architectures, has emerged to enable an FPGA accelerator to serve as a PCIe-based soft GPU. Building on this, the Vortex GPUs in our edge acceleration architecture further integrate a low-latency RDMA function. Our preliminary study shows that, compared with a cluster of servers aided by the original Vortex GPUs, our Vortex GPU cluster is fast, scalable, and energy-efficient.
- Ching-Hsiang Chu, Xiaoyi Lu, Ammar A. Awan, Hari Subramoni, Bracy Elton, and Dhabaleswar K. Panda. 2019. Exploiting Hardware Multicast and GPUDirect RDMA for Efficient Broadcast. IEEE Transactions on Parallel and Distributed Systems 30, 3 (2019), 575--588. https://doi.org/10.1109/TPDS.2018.2867222
- Blaise Tine, Krishna Praveen Yalamarthy, Fares Elsabbagh, and Hyesoon Kim. 2021. Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (Virtual Event, Greece) (MICRO '21). Association for Computing Machinery, New York, NY, USA, 754--766. https://doi.org/10.1145/3466752.3480128
- Hong Zhang, Yupeng Tang, Anurag Khandelwal, and Ion Stoica. 2023. SHEPHERD: Serving DNNs in the Wild. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 787--808. https://www.usenix.org/conference/nsdi23/presentation/zhang-hong
Index Terms
- Poster: A Fast, Scalable, and Energy-efficient Edge Acceleration Architecture based on GPU Cluster