Google to build TPU2 supercomputer for AI applications


Monday, December 18, 2017

So far, Google has only provided a few images of its second-generation Tensor Processing Unit, or TPU2, since announcing the AI chip in May at Google I/O.

The company has now revealed a little more about the processor, the souped-up successor to Google's first custom AI chip.

As spotted by The Register, Jeff Dean from the Google Brain team delivered a TPU2 presentation to scientists at last week's Neural Information Processing Systems (NIPS) conference in Long Beach, California.

Earlier this year, Dean said that the first TPU focused on efficiently running machine-learning models for tasks like language translation, AlphaGo's Go strategy, and search and image recognition. Those TPUs were good for inference, that is, serving models that had already been trained.

However, the more intensive task of training these models was done separately on top-end GPUs and CPUs. Training time on this equipment still took days or weeks, blocking researchers from cracking bigger machine-learning problems.

TPU2 is intended to both train and run machine-learning models and cut out this GPU/CPU bottleneck.

A custom high-speed network links TPU2 devices, each of which delivers 180 teraflops of floating-point performance, so they can be coupled together into TPU Pod supercomputers. The TPU Pods are only available through Google Compute Engine as 'Cloud TPUs' that can be programmed with TensorFlow.
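For developers, the Cloud TPU programming model Dean described is ordinary TensorFlow with the graph marked for TPU execution. Below is a minimal sketch in the TensorFlow 1.x style of the period, using the era's tf.contrib.tpu API; the endpoint address, the toy computation, and the session setup are illustrative assumptions rather than details from the article.

    import tensorflow as tf

    # Placeholder endpoint; a real Cloud TPU's gRPC address would go here.
    TPU_MASTER = 'grpc://10.0.0.2:8470'

    def computation(x):
        # Toy stand-in for a real model graph.
        return tf.reduce_sum(tf.square(x))

    inputs = tf.random_normal([128, 128])
    # tpu.rewrite() marks the computation so it is compiled for and run on TPU cores.
    tpu_op = tf.contrib.tpu.rewrite(computation, [inputs])

    with tf.Session(TPU_MASTER) as sess:
        sess.run(tf.contrib.tpu.initialize_system())
        print(sess.run(tpu_op))
        sess.run(tf.contrib.tpu.shutdown_system())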

Dean's NIPS presentation offers more details on the design of the TPU Pods, the TPU2 devices, and the TPU2 chips.

Each TPU Pod will consist of 64 TPU2s, delivering a massive 11.5 petaflops with four terabytes of high-bandwidth memory.

Meanwhile, each TPU2 consists of four TPU chips, offering 180 teraflops of computation, 64GB of high-bandwidth memory, and 2,400GB/s memory bandwidth.

As for the TPU2 chips themselves, each features two cores with 8GB of high-bandwidth memory apiece, giving 16GB of memory per chip. Each chip has 600GB/s of memory bandwidth and delivers 45 teraflops of calculations.
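These per-chip, per-device, and per-pod figures are mutually consistent; a quick back-of-the-envelope check in plain Python, using only the numbers quoted above:

    # Figures quoted above, per TPU2 chip.
    CORES_PER_CHIP = 2
    HBM_PER_CORE_GB = 8
    TFLOPS_PER_CHIP = 45
    BANDWIDTH_PER_CHIP_GBS = 600

    CHIPS_PER_DEVICE = 4   # four chips per TPU2 device
    DEVICES_PER_POD = 64   # 64 TPU2 devices per TPU Pod

    hbm_per_chip_gb = CORES_PER_CHIP * HBM_PER_CORE_GB        # 16 GB
    tflops_per_device = CHIPS_PER_DEVICE * TFLOPS_PER_CHIP    # 180 teraflops
    hbm_per_device_gb = CHIPS_PER_DEVICE * hbm_per_chip_gb    # 64 GB
    bw_per_device_gbs = CHIPS_PER_DEVICE * BANDWIDTH_PER_CHIP_GBS  # 2,400 GB/s

    pod_petaflops = DEVICES_PER_POD * tflops_per_device / 1000.0  # the "11.5 petaflops"
    pod_hbm_tb = DEVICES_PER_POD * hbm_per_device_gb / 1024.0     # the "four terabytes"

    print(pod_petaflops, pod_hbm_tb)  # 11.52 4.0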

As Dean notes, TPU1 was great for inference, but the next breakthroughs in machine learning will require the power of its TPU2-based TPU Pods. He offered 1,000 free Cloud TPUs to top researchers accepted into Google's selective TensorFlow Research Cloud program.

By: DocMemory