
Abstract
Computing accelerators, such as GPUs, enhance performance for tasks like high-resolution gaming and AI computations, while specialized accelerators in mobile devices prioritize energy efficiency. Designing these accelerators is complex: it requires identifying the computational tasks best suited for acceleration and optimizing them for performance and efficiency.
Hardware design relies on cycle-accurate RTL simulation, but conventional sequential simulation is a bottleneck for increasingly complex systems. Parallel RTL simulation, which distributes the simulation across multiple cores, offers a potential solution, but current simulators are limited by high synchronization and communication costs. This work introduces two solutions.
The first, Manticore, is a hardware accelerator for RTL simulation that uses static scheduling to reduce synchronization overhead, making fine-grained parallelism practical. Our 225-core FPGA prototype runs at 475 MHz and outperforms current RTL simulators. The second, Parendi, leverages the Graphcore IPU to run RTL simulations across 5,888 cores, demonstrating that massively parallel RTL simulation is effective with the right hardware support.
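To make the synchronization cost concrete, the sketch below (a hypothetical illustration, not Manticore's or Parendi's actual implementation) shows a partitioned parallel RTL simulation loop in C++. Each worker thread evaluates a precomputed, statically ordered list of gate evaluations for its partition, yet every simulated cycle still ends at a barrier; with fine-grained partitions, this per-cycle coordination is exactly the overhead that static scheduling aims to eliminate. The `Partition` and `simulate` names are illustrative assumptions.

```cpp
// Minimal sketch of partitioned RTL simulation (C++20).
// Hypothetical code for illustration only.
#include <barrier>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// One partition of the design's netlist, assigned to one core.
// The evaluation order is fixed at compile time, so no runtime
// dependency checks are needed within a partition.
struct Partition {
    std::vector<std::function<void()>> schedule;
};

// Simulate `cycles` clock cycles with one thread per partition.
void simulate(std::vector<Partition>& parts, int cycles) {
    // Conventional parallel simulators synchronize all workers at
    // every simulated cycle; as partitions shrink, this barrier
    // dominates the runtime. A fully static schedule, as in
    // Manticore, pushes this coordination into compile time instead.
    std::barrier sync(static_cast<std::ptrdiff_t>(parts.size()));
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        workers.emplace_back([&parts, &sync, cycles, i] {
            for (int c = 0; c < cycles; ++c) {
                // Statically ordered work: no locks, no scheduling.
                for (auto& eval : parts[i].schedule) eval();
                // Per-cycle synchronization point: the bottleneck.
                sync.arrive_and_wait();
            }
        });
    }
    for (auto& w : workers) w.join();
}
```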
Biography
James Larus is a Professor Emeritus and former Dean of the School of Computer and Communication Sciences (IC) at EPFL. He is currently the Editor-in-Chief of the Communications of the ACM (CACM). Before joining IC in 2013, Larus was a researcher, manager, and director in Microsoft Research for over 16 years and an assistant and associate professor in the Computer Sciences Department at the University of Wisconsin, Madison.
Larus actively contributes to numerous research communities with over 100 papers (including 14 best and most influential paper awards) and over 40 US patents. He received a National Science Foundation Young Investigator award in 1993 and became an ACM Fellow in 2006.
Larus received his PhD in Computer Science from UC Berkeley in 1989 and an AB in Applied Mathematics from Harvard in 1980.