skip to main content
10.1145/3109984.3110005acmconferencesArticle/Chapter ViewAbstractPublication PagessbcciConference Proceedingsconference-collections
research-article

Exploiting cache locality to speedup register clustering

Authors Info & Claims
Published:28 August 2017Publication History

ABSTRACT

Physical design tools must handle huge amounts of data in order to solve problems for circuits with millions of cells. Traditionally, Electronic Design Automation tools are implemented using Object-Oriented Design. However, using this paradigm may lead to overly complex objects that result in waste of cache memory space. This memory wasting harms cache locality exploration and, consequently, degrades software runtime. This work proposes applying Data-Oriented Design on the register clustering problem. Differently from the traditional Object-Oriented design, the Data-Oriented Design programming model focus on how the data is organized in the memory. As consequence, this programming model may better explore cache spatial locality. In order to evaluate the impact of using the Data-Oriented Design programming model for register clustering, we implemented two software prototypes (a sequential and a parallel implementation) of the K-means clustering algorithm for each programming model. Experimental results showed that the sequential Data-Oriented Design implementation is on average 7.5% faster when compared to the Object-Oriented Design implementation, while its parallel version is 15% faster when compared to the Object-Oriented one.

References

  1. Wing-Kai Chow, Chak-Wa Pui, and Evangeline FY Young. 2016. Legalization algorithm for multiple-row height standard cell design. In Design Automation Conference (DAC), 2016 53nd ACM/EDAC/IEEE. IEEE, 1--6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Federal University of Santa Catarina Embedded Computing Lab. 2017. Ophidian: an Open Source Library for Physical Design Research and Teaching. https://github.com/eclufsc/ophidian. (2017).Google ScholarGoogle Scholar
  3. Guilherme Flach, Mateus Fogaça, Jucemar Monteiro, Marcelo Johann, and Ricardo Reis. 2017. Rsyn: An Extensible Physical Synthesis Framework. In Proceedings of the 2017 ACM on International Symposium on Physical Design. ACM, 33--40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Tiago Fontana, Renan Netto, Vinicius Livramento, Chrystian Guth, Sheiny Almeida, Laércio Pilla, and José Luis Güntzel. 2017. How Game Engines Can Inspire EDA Tools Development: A use case for an open-source physical design library. In Proceedings of the 2017 ACM on International Symposium on Physical Design. ACM, 25--31. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Guthaus, G. Wilke, and Reis. 2013. Revisiting automated physical synthesis of high-performance clock networks. TODAES 18, 2 (2013), 31:1--31:27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Tsung-Wei Huang and Martin DF Wong. 2015. Opentimer: A high-performance timing analysis tool. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design. IEEE Press, 895--902. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Silicon Integration Initiative. 2017. Open Access. http://www.si2.org/openaccess/. (2017).Google ScholarGoogle Scholar
  8. Jinwook Jung, Iris Hui-Ru Jiang, Gi-Joon Nam, Victor N Kravets, Laleh Behjat, and Yin-Lang Li. 2016. OpenDesign flow database: the infrastructure for VLSI design and design automation research. In Proceedings of the 35th International Conference on Computer-Aided Design. ACM, 42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Andrew B Kahng, Hyein Lee, and Jiajia Li. 2014. Horizontal benchmark extension for improved assessment of physical CAD research. In Proceedings of the 24th edition of the great lakes symposium on VLSI. ACM, 27--32. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Andrew B Kahng, Jens Lienig, Igor L Markov, and Jin Hu. 2011. VLSI physical design: from graph partitioning to timing closure. Springer Science & Business Media. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Kim, J. Hu, J. Li, and N. Viswanathan. 2015. ICCAD-2015 CAD contest in incremental timing-driven placement and benchmark suite. In ICCAD. 921--926. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. University of Michigan. 2017. UMICH Physical Design Tools. https://www.src.org/library/publication/p013527/. (2017).Google ScholarGoogle Scholar
  13. OpenMP. 2017. The OpenMP API. http://openmp.org/. (2017).Google ScholarGoogle Scholar
  14. David Papa, Charles Alpert, Cliff Sze, Zhuo Li, Natarajan Viswanathan, Gi-Joon Nam, and Igor Markov. 2011. Physical synthesis with clock-network optimization for large systems on chips. IEEE Micro 31, 4 (2011), 51--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. David A Patterson and John L Hennessy. 2013. Computer organization and design: the hardware/software interface. Newnes. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Shokri Z Selim and Mohamed A Ismail. 1984. K-means-type algorithms: a generalized convergence theorem and characterization of local optimality. Pattern Analysis and Machine Intelligence, IEEE Transactions on 1 (1984), 81--87. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Chao-Hung Wang, Yen-Yi Wu, Jianli Chen, Yao-Wen Chang, Sy-Yen Kuo, Wenxing Zhu, and Genghua Fan. 2017. An effective legalization algorithm for mixed-cell-height standard cells. In Design Automation Conference (ASP-DAC), 2017 22nd Asia and South Pacific. IEEE, 450--455.Google ScholarGoogle ScholarCross RefCross Ref
  18. Gang Wu, Yue Xu, Dean Wu, Manoj Ragupathy, Yu-yen Mo, and Chris Chu. 2016. Flip-flop clustering by weighted K-means algorithm. In Design Automation Conference (DAC), 2016 53nd ACM/EDAC/IEEE. IEEE, 1--6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C Yeh, G Wilke, Hongyu Chen, and others. 2006. Clock distribution architectures: A comparative study. In ISQED. 85--91. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Exploiting cache locality to speedup register clustering

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SBCCI '17: Proceedings of the 30th Symposium on Integrated Circuits and Systems Design: Chip on the Sands
          August 2017
          238 pages
          ISBN:9781450351065
          DOI:10.1145/3109984

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 28 August 2017

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate133of347submissions,38%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader