Matt Sinclair
Assistant Professor in the Computer Sciences Department at the University of Wisconsin-Madison
UCSB, Henley Hall 1010

Abstract

To reach performance goals, modern computing systems increasingly rely on large numbers of compute accelerators, which offer greater power efficiency and thus enable higher performance within a constrained power budget. However, using accelerators increases heterogeneity at multiple levels, including the architecture, resource allocation, competing user needs, and manufacturing variability. Accordingly, current and future systems need to efficiently handle many simultaneous jobs while balancing power management (PM) and multiple levels of heterogeneity. In recent work, we have demonstrated the extent of this variability in modern accelerator-rich systems (SC'22) and shown how to embrace variability in cluster-level job schedulers (SC'24). This work significantly improves the efficiency of modern systems for a range of ML workloads. However, scheduling jobs at the software and runtime layers is limited in its ability to change policies quickly and dynamically as cluster conditions evolve. A major limiter to further improving efficiency is the lack of standards for exposing power information in modern accelerators. Thus, for future systems we propose to build on the insights generated by our optimizations for current systems and to apply co-design that makes the hardware, software, and runtime layers aware of the variability in these systems.

Biography

Matt Sinclair is an Assistant Professor in the Computer Sciences Department at the University of Wisconsin-Madison. He is also an Affiliate Faculty member in the ECE Department and the Teaching Academy at UW-Madison. His research primarily focuses on how to design, program, and optimize future heterogeneous systems. He also designs the tools for future heterogeneous systems, including serving on the gem5 Project Management Committee and the MLCommons HPC, Power, and Science Working Groups. He is a recipient of the DOE Early Career and NSF CAREER awards, and his work has been funded by AMD, the DOE, Google, NSF, and SRC. His research has also been recognized several times, including an ACM Doctoral Dissertation Award nomination, a Qualcomm Innovation Fellowship, the David J. Kuck Outstanding PhD Thesis Award, and an ACM SIGARCH - IEEE Computer Society TCCA Outstanding Dissertation Award Honorable Mention. He is also the current steward for the ISCA Hall of Fame.