DOI: 10.1145/3241793.3241801
research-article

Swap Based Merge Network for High Performance Sorting Accelerators

Published: 20 June 2018

ABSTRACT

The merge network is the key hardware module for constructing FPGA-based sorting accelerators. We therefore propose a novel merge network based on compare-and-swap operations for high-performance sorting accelerators. Our proposal builds on the state-of-the-art merge network and aims to mitigate its main drawback: the maximum wiring delay and the maximum fanout both increase as the number of records output per cycle grows.

We implement several merge networks that adopt the proposal on a Virtex-7 FPGA. The evaluation results show that the maximum fanout of the proposal is constant and that its maximum wiring delay is almost constant. Because of these desirable properties, the largest configuration of the proposal achieves 1.43x higher throughput than the state-of-the-art merge network.
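The abstract does not describe the internal structure of the proposed network. As a generic illustration of what a merge network built from compare-and-swap operations does, the following Python sketch models a classic bitonic merge network in software. It is not the authors' design. The function names `compare_and_swap` and `bitonic_merge` are hypothetical, and the sketch assumes two ascending inputs of equal, power-of-two length.

```python
def compare_and_swap(records, i, j):
    """Order records[i] and records[j] so that the smaller one comes first."""
    if records[i] > records[j]:
        records[i], records[j] = records[j], records[i]


def bitonic_merge(a, b):
    """Merge two ascending lists of equal power-of-two length.

    Software model of a textbook bitonic merge network; an illustration
    only, not the network proposed in the paper.
    """
    n = len(a)
    if n != len(b) or n == 0 or n & (n - 1):
        raise ValueError("inputs must have equal, power-of-two length")

    # Reversing the second input makes the concatenation a bitonic sequence.
    records = list(a) + list(b)[::-1]

    stride = n
    while stride >= 1:
        # One pass of this loop corresponds to one stage of the network.
        # The compare-and-swap operations inside a stage touch disjoint
        # pairs, so hardware could evaluate them in parallel.
        for start in range(0, 2 * n, 2 * stride):
            for i in range(start, start + stride):
                compare_and_swap(records, i, i + stride)
        stride //= 2
    return records


if __name__ == "__main__":
    print(bitonic_merge([1, 4, 6, 9], [2, 3, 7, 8]))
    # prints [1, 2, 3, 4, 6, 7, 8, 9]
```

In this model the comparison pattern is fixed and does not depend on the data, which is the property that lets such a network be laid out as wires and comparators. How far each stage's wires reach and how many loads each signal drives are the kinds of quantities the abstract refers to as wiring delay and fanout.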

  • Published in

    HEART '18: Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies
    June 2018
    125 pages
    ISBN: 9781450365420
    DOI: 10.1145/3241793

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 22 of 50 submissions, 44%