Wednesday, June 5, 2024
In January 2024, leading private equity firm Blackstone announced it was building a $25 billion AI data empire. A few months later, OpenAI and Microsoft followed suit with a proposal to build Stargate, a $100 billion AI supercomputer intended to put the pair at the forefront of the AI revolution.
Of course, this is not a surprise. With the rapid acceleration the AI sector has witnessed over the past few years, industry giants all over the world are in a frantic haste to get front-row seats. Experts already predict the global AI market will hit a massive $826.70bn in volume by 2030, with an annual growth rate of 28.46%.
The only problem?
GPUs.
The von Neumann architecture, the design model that most general-purpose computers follow (a CPU, memory, I/O devices, and a system bus), is inherently limited even though it offers simplicity and cross-system compatibility. Its single system bus restricts the speed at which data can move between memory and the CPU, making CPUs less than optimal for AI and machine learning workloads.
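To make the bottleneck concrete, here is a minimal back-of-the-envelope sketch in Python. The bandwidth and compute figures are hypothetical placeholders, not measurements of any real chip, and the function name estimate_seconds is invented for this illustration. The point is only that when each value fetched over the bus is used for very little arithmetic, transfer time rather than compute time sets the pace.

```python
# Rough model: a task is limited by whichever is slower,
# moving the data over the bus or doing the arithmetic on it.
# All hardware numbers below are hypothetical, for illustration only.

def estimate_seconds(num_bytes, num_flops, bus_bytes_per_s, flops_per_s):
    """Return (transfer_time, compute_time, limiting_factor)."""
    transfer_time = num_bytes / bus_bytes_per_s
    compute_time = num_flops / flops_per_s
    limiting = "memory bus" if transfer_time > compute_time else "compute"
    return transfer_time, compute_time, limiting

# Multiplying a 4096 x 4096 matrix of 4-byte floats by a vector:
n = 4096
matrix_bytes = n * n * 4          # every matrix element is read once
flops = 2 * n * n                 # one multiply and one add per element

transfer, compute, limiting = estimate_seconds(
    matrix_bytes, flops,
    bus_bytes_per_s=50e9,         # assumed 50 GB/s memory bandwidth
    flops_per_s=500e9,            # assumed 500 GFLOP/s peak compute
)
print(f"transfer: {transfer * 1e3:.3f} ms, compute: {compute * 1e3:.3f} ms")
print(f"limited by: {limiting}")
```

Under these assumed numbers the processor would spend most of its time waiting on memory, which is the situation the paragraph above describes.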
This is where Graphics Processing Units (GPUs) come in. By building parallelism into the hardware, GPUs spread work across many cores that execute instructions simultaneously, which delivers far better performance on such workloads. However, with the dawn of AI technology, the demand for GPUs has skyrocketed, straining supply chains and posing a severe bottleneck to the efforts of many researchers and startups. This is especially true since the bulk of the world's supply of AI-grade GPUs comes from one dominant producer: Nvidia.
While hyperscalers like AWS, Google Cloud Platform, and others may be able to easily access A100s and H100s from Nvidia, what viable alternatives can help other firms, researchers, and startups latch onto the AI train instead of being stuck indefinitely on the Nvidia waitlist?
Field Programmable Gate Arrays
FPGAs are reprogrammable integrated circuits that can be configured to serve specific tasks and application needs. They offer flexibility, can be adapted to meet varying requirements, and are cost-effective. Because FPGAs are efficient at parallel processing, they are well suited to AI and machine learning uses and offer notably low latency in real-world applications.
A related example of purpose-built silicon is the Tesla D1 Dojo chip, which the company unveiled in 2021 to train computer vision models for self-driving cars, although the D1 is a custom ASIC rather than an FPGA. FPGAs do have drawbacks, however: architecting the hardware demands a high level of engineering expertise, which can translate into expensive upfront costs.
AMD GPUs
In 2023, companies like Meta, Oracle, and Microsoft signaled their interest in AMD GPUs as a more cost-effective solution and a way to avoid potential vendor lock-in with the dominant Nvidia. AMD's Instinct MI300 series, for example, is considered a viable alternative for scientific computing and AI uses. Its CDNA 3 architecture, which emphasizes modularity and support for open standards, plus its more affordable price point, makes it a promising alternative to Nvidia GPUs.
Tensor Processing Units
TPUs are application-specific integrated circuits (ASICs) designed to perform machine-learning tasks. A brainchild of Google, TPUs rely on a domain-specific architecture to accelerate the tensor operations at the heart of neural networks. They also have the advantage of energy efficiency and optimized performance, making them an affordable alternative for scaling and managing costs.
It should be noted, however, that the TPU ecosystem is still emerging, and the current availability is limited to the Google Cloud Platform.
Decentralized Marketplaces
Decentralized marketplaces are also trying to ease the constricted GPU supply chain in their own way. By capitalizing on idle GPU resources from legacy data centers, academic institutions, and even individuals, these marketplaces provide researchers, startups, and other institutions with enough GPU resources to run their projects. Examples include Render Network, FluxEdge, Bittensor, and others.
Many of these marketplaces offer consumer-grade GPUs that can sufficiently handle the needs of small to medium AI/ML companies, thus reducing the pressure on high-end professional GPUs. Some marketplaces provide additional options for clients who want industrial-grade GPUs.
CPUs
CPUs are often considered the underdogs for AI purposes due to their limited throughput and the von Neumann bottleneck. However, there are ongoing efforts to run AI algorithms more efficiently on CPUs. These include allocating specific workloads to the CPU, such as simple NLP models and algorithms that perform complex statistical computations.
While this may not be a one-size-fits-all solution, it suits algorithms that are hard to run in parallel, such as recurrent neural networks or recommender systems, in both training and inference.
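As a rough illustration of why some workloads resist parallelism, the Python sketch below contrasts an element-wise operation with a toy one-unit recurrent layer. The function names and weights are made up for this example, and it is not how any production framework implements a recurrent network. It only shows the data dependency: each recurrent step needs the result of the previous one.

```python
import math

def elementwise(xs):
    # Each output depends only on its own input, so the items could be
    # processed in any order, or all at once on parallel hardware.
    return [math.tanh(x) for x in xs]

def recurrent(xs, w_h=0.5, w_x=1.0):
    # Toy recurrence: h_t = tanh(w_h * h_{t-1} + w_x * x_t).
    # Step t cannot start until step t-1 has finished.
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states

inputs = [0.1, 0.2, 0.3, 0.4]
print(elementwise(inputs))
print(recurrent(inputs))
```

Reordering the inputs simply reorders the element-wise outputs, whereas it changes the recurrent states themselves. That step-by-step chain is why many parallel cores offer less benefit for this kind of workload.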
Rounding Up
The scarcity of GPUs for AI purposes may not be going away anytime soon, but there is a bit of good news. The ongoing innovations in AI chip technology point to an exciting future full of possibilities, one in which the GPU problem may eventually fade into the background. A lot of potential remains to be harnessed in the AI sector, and we might just be standing on the precipice of the most significant technology revolution known to humanity.
By: DocMemory Copyright © 2023 CST, Inc. All Rights Reserved