Wednesday, September 4, 2024
In the rush to develop next-generation AI, Elon Musk has assembled a new supercomputer running 100,000 Nvidia GPUs in four months.
On Monday, Musk tweeted that the supercomputer, based in Memphis, Tennessee, is now online after the city initially announced the project back in June.
“From start to finish, it was done in 122 days,” Musk wrote in the tweet, which also claims that “Colossus is the most powerful AI training system in the world.”
The supercomputer was built using 100,000 Nvidia H100s, a GPU that tech companies worldwide have been scrambling to buy to train new AI models. The GPU usually costs around $30,000, suggesting that Musk spent at least $3 billion to build the new supercomputer, a facility that will also require significant electricity and cooling.
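The $3 billion figure is a straightforward multiplication of the reported GPU count by the approximate unit price. A minimal sketch of that back-of-envelope estimate follows; the function name is hypothetical, and the calculation covers GPUs only, assuming the article's rough $30,000 price with no volume discounts and no networking, power, or cooling costs.

```python
# Back-of-envelope estimate of GPU spend, using the article's figures.
GPU_COUNT = 100_000          # H100 GPUs reported for Colossus
PRICE_PER_GPU_USD = 30_000   # approximate unit price cited in the article


def estimate_gpu_cost(gpu_count: int, unit_price_usd: int) -> int:
    """Return the GPU-only hardware cost in US dollars (hypothetical helper)."""
    return gpu_count * unit_price_usd


total = estimate_gpu_cost(GPU_COUNT, PRICE_PER_GPU_USD)
print(f"Estimated GPU spend: ${total / 1e9:.1f} billion")  # $3.0 billion
```

Actual spending would differ depending on the negotiated price and on everything else a data center needs beyond the GPUs themselves.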
The Colossus supercomputer will expand over time. In his tweet, Musk said the facility will “double in size to 200k” GPUs in a few months, an expansion that will include 50,000 Nvidia H200 GPUs, which feature upgraded memory.
Musk built the facility for xAI, his latest startup, which is focused on creating generative AI technologies, including the controversial pro-free-speech chatbot Grok. By assembling GPUs at this scale, xAI is aiming to accelerate training for Grok and other AI projects to unlock new capabilities and improvements.
It’s doubtful Musk’s supercomputer is the most powerful AI training system in the world since companies including Meta, Microsoft, and OpenAI are also buying hundreds of thousands of Nvidia GPUs for their own AI training projects. Still, Colossus highlights how fast the industry is building new facilities to train AI. In another tweet, Musk indicated that other projections xAI had received estimated it would take 12 to 18 months to bring the supercomputer online.
Despite the achievement, the arrival of Colossus is also raising concerns that the supercomputer will take a toll on Memphis’s environment, water supplies, and the electricity grid. One local group has also called on the city to investigate whether the facility’s turbines risk generating air pollution. However, city officials say xAI has pledged to help improve the local infrastructure to support the supercomputer.
By: DocMemory