Maximizing Server Efficiency with Machine-Learning Accelerators
Deep convolutional neural networks (CNNs) are rapidly becoming the dominant approach to computer vision and a major component of many other pervasive machine-learning tasks, such as speech recognition, natural language processing, and fraud detection. As a result, accelerators for efficiently evaluating deep neural networks (DNNs) are growing in popularity. Our work in this area focuses on two key challenges: minimizing off-chip data transfer and maximizing the utilization of the computation units. In this talk, I will present an overview of my research work on understanding and impr