National Aeronautics and Space Administration

Cliff Young: "Tensor Processing Unit: Hardware for Fast Neural Net Inferencing".

Abstract: With the ending of Moore's Law, many computer architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. The Tensor Processing Unit (TPU), deployed in Google datacenters since 2015, is a custom chip that accelerates deep neural networks (DNNs). We compare the TPU to contemporary server-class CPUs and GPUs deployed in the same datacenters. Our benchmark workload, written using the high-level TensorFlow framework, uses production DNN applications that represent 95% of our datacenters’ DNN demand. The TPU is an order of magnitude faster than contemporary CPUs and GPUs, and its advantage in performance per Watt is even larger. The TPU’s deterministic execution model turns out to be a better match to the response-time requirement of our DNN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, …) that help average throughput more than guaranteed latency. The lack of such features also helps explain why, despite having myriad arithmetic units and a big memory, the TPU is relatively small and low power.

Bio: Cliff Young is a member of the Google Brain team, whose mission is to develop deep learning technologies and deploy them throughout Google. He is one of the designers of Google’s Tensor Processing Unit (TPU), which is used in production applications including Search, Maps, Photos, and Translate. TPUs also powered AlphaGo’s historic 4-1 victory over Go champion Lee Sedol. Before joining Google, Cliff worked at D. E. Shaw Research, building special-purpose supercomputers for molecular dynamics, and at Bell Labs.
