ABSTRACT
We introduce a partitioning method for massively parallel, hardware-accelerated functional verification. Our approach augments classical hypergraph partitioning to model temporal dependencies that maximize parallelization within the instruction memories of the machine. Simulation depth is further reduced by optimizing path criticality and cut directionality. Our techniques are demonstrated on an industrial accelerator containing 262,144 parallel processors, and benchmarked across designs containing up to 200 million gates.
Index Terms
- Robust partitioning for hardware-accelerated functional verification