Wednesday, April 10, 2024
Anyone who has not been hiding under the proverbial rock since before the pandemic has at least heard of AI. In the 18 months since the launch of ChatGPT in late 2022, AI has become a topic of conversation not only from Main Street to Wall Street, but from Capitol Hill to the ski slopes of Davos at the World Economic Forum’s annual meeting. Despite the disparate nature of these conversations and the different levels of expertise of those taking part, they all have one thing in common: they are all trying to understand AI, its impact and its implications.
There appears to be an understanding, or maybe a hope, that if AI is at least mentioned in conjunction with something else, that something else will immediately get more attention. While this might have been the case in 2023, it no longer is. What appears to be less well understood is that there are different kinds of AI, and some of them have been around a lot longer than ChatGPT.
Additionally, these different kinds of AI have different implications in terms of supporting hardware and software, as well as use cases. With a greater understanding of these nuances comes greater sophistication and a realization that simply mentioning “AI” is no longer adequate. The conversation must cover what problem is being addressed, how AI is being used to address it and for whom.
Traditional vs. generative AI
Before delving into the maturing nature of the AI ecosystem and the solutions that are starting to be brought to bear, it is worth taking a small step back to level-set on two of the primary types of AI: traditional AI and generative AI. Given that most people know AI primarily through the hype generated by ChatGPT, their understanding of AI revolves around what is better described as “generative AI.” There is a lesser-known, but more prevalent, form of AI now often referred to as “traditional AI.”
The primary characteristic that distinguishes generative AI from traditional AI is a model’s ability to create novel content based on prompted inputs for the former, as opposed to producing a known outcome based on specific inputs for the latter. While both types of AI are predictive in nature, generative AI creates new patterns of data, or tokens, by selecting the most likely next occurrence given the data on which it was trained. Traditional AI, on the other hand, recognizes existing patterns and acts upon them based on predetermined rules and actions.
Essentially, while the latter is all about pattern recognition, the former is about pattern creation. A simple example was demonstrated by Jensen Huang at GTC 2024: traditional AI started to take off with the AlexNet neural network model in 2012. It could process a picture of a cat and then identify that the picture was of a cat. With generative AI, you input a text prompt “cat” and the neural net will generate a picture of a cat.
Another point of differentiation is the amount of resources required for both training and inference of each type of AI. On the training side, given the size of the models and the amount of data required to adequately train them, generative AI models typically require a data center’s worth of CPUs and GPUs, numbering in the tens of thousands. In contrast, typical traditional AI training might require a single server’s worth of high-end CPUs and maybe a handful of GPUs.
Similarly for inferencing, generative AI might utilize the same data center scale of processing resources or, at best, when optimized for edge applications, a heterogeneous compute architecture which typically consists of CPUs, GPUs, neural processing units (NPUs) and other accelerators providing multiple tens of TOPS. For these edge applications running on-device, where generative AI models are in the range of 7 billion parameters or fewer, the requirement is estimated to be at least 30-40 TOPS for the NPU alone. On the other hand, traditional AI inferencing can typically be performed with microcontroller-level resources or, at worst, a microcontroller with a small AI accelerator.
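To give a feel for the scale behind these numbers, here is a rough back-of-envelope sketch. It rests on assumptions that are not from the article: roughly two operations per parameter for each generated token (a common rule of thumb), weights stored at a chosen bit width, and a hypothetical utilization fraction of peak TOPS. The function names are illustrative only, and the throughput figure is a compute-only ceiling; real on-device speed is often limited by memory bandwidth instead.

```python
# Back-of-envelope sizing for an on-device generative model.
# Assumptions (not from the article): ~2 operations per parameter per
# generated token, and a fixed fraction of peak TOPS actually usable.


def weight_footprint_gb(params: float, bits_per_weight: int) -> float:
    """Memory needed to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def ops_per_token(params: float, ops_per_param: float = 2.0) -> float:
    """Approximate operations needed to produce one token."""
    return params * ops_per_param


def compute_bound_tokens_per_second(
    params: float, peak_tops: float, utilization: float = 0.3
) -> float:
    """Compute-only ceiling on token throughput; ignores memory bandwidth."""
    usable_ops_per_second = peak_tops * 1e12 * utilization
    return usable_ops_per_second / ops_per_token(params)


if __name__ == "__main__":
    params = 7e9  # the 7-billion-parameter class mentioned above
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_footprint_gb(params, bits):.1f} GB")
    print(f"ops per token: {ops_per_token(params):.2e}")
    for tops in (30, 40):
        rate = compute_bound_tokens_per_second(params, tops)
        print(f"{tops} TOPS ceiling: {rate:.0f} tokens/s")
```

Under these assumptions, a 7-billion-parameter model at 4-bit precision needs about 3.5 GB for its weights alone, which helps explain why models of this size sit near the practical limit for on-device use.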
Granted, the scale of these resource requirements for the different types of AI depends on model sizes, the amount of data required to adequately train the models and how quickly the training or inferencing needs to be performed. For example, there are some traditional AI models, like those used for genome sequencing, that require significant amounts of resources and might rival generative AI requirements. However, in general and for the most widely used models, these resource comparisons are valid and applicable.
What is it good for? Potentially everything.
As the ecosystem for AI solutions continues to mature, it is becoming clear that it is no longer enough to just mention AI. A more developed strategy, positioning and demonstration of the solutions are required to establish a bona fide claim to participate as a legitimate competitor. Potential customers have seen the technology showcases that create images of puppies eating ice cream on the beach. That’s great. But they are now asking, “How can it really provide value by helping me personally or by solving my enterprise challenges?”
The great thing about the AI ecosystem is that it is just that: an ecosystem of many diverse companies all trying to answer these questions. Qualcomm and IBM, both present at this year’s Mobile World Congress (MWC), are worth noting in this context, given how they are using both types of AI: Qualcomm for consumers and prosumers, and IBM specifically for enterprises.
Additionally, both companies not only have their own solutions but also offer development environments that help developers create AI-based applications, which is critical if the developer ecosystem is to do what it does best. Just as the app store and software development kits were required at the onset of the smartphone era, these development environments will allow the developer ecosystem to innovate and create AI-based apps for use cases that have not even been thought of yet.
To help answer the question “What is AI good for?”, Qualcomm demonstrated a handful of real-world applications of AI at the show. On the traditional AI front, their latest Snapdragon X80 5G modem-RF platform uses AI to dynamically optimize 5G. It accomplishes this by providing the modem’s AI with contextual awareness of what application or workload the user is running, as well as the current RF environment in which the device is operating.
Informed by this awareness, the AI then makes real-time decisions on key optimization factors such as transmit power, antenna configuration and modulation schemes, among others, to dynamically optimize the 5G connection and provide the best performance at the lowest power for what the application requires and the RF environment allows.
On the generative AI front, Qualcomm’s solutions highlighted how generative AI is enabling a new class of AI smartphones and future AI PCs. Given how many user-generated images and videos are created using smartphones, many of the solutions centered on image and video manipulation, with privacy and personalization achieved by running the generative AI model on device. Additionally, they demonstrated how multimodal generative AI models facilitate a more natural way of interacting with these models, allowing prompts to include not only text but also voice, audio and image inputs.
For example, an image of raw ingredients can be submitted with a prompt asking for a recipe that includes those ingredients. The multimodal model will then combine the text or verbal prompt with the ingredients it identifies in the picture to output a recipe using those ingredients.
The first of these solutions are hitting the market now through first-party applications developed by the smartphone OEMs themselves. This makes sense as the OEMs have been able to work with the chipset supplier—in this case Qualcomm—to best make use of the available resources like the NPU and optimize these generative AI-based applications for performance and power consumption. These first-party applications will serve as an appetizer, whetting the appetites of smartphone users and helping them understand what on-device generative AI can do. Ultimately, TIRIAS Research believes this will lead to the next wave of adoption driven by third-party generative AI-based application developers.
This is where Qualcomm’s announcement of their AI Hub will help. The AI Hub aims to allow developers to take full advantage of Qualcomm’s heterogeneous computing architecture in their Snapdragon chipsets, which consist of CPUs, GPUs and NPUs. One of the trickiest aspects of developing a third-party application that utilizes generative AI models is deciding which processing resource each workload should run on to get the best balance of performance and power consumption. AI Hub lets developers see how their application performs on the CPU versus the GPU versus the NPU and optimize from there. Furthermore, developers can run their applications on real devices over the cloud using what Qualcomm is calling their “device farm.” The best part for developers? They can do all of this for free, according to Qualcomm.
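As a rough illustration of the kind of decision this profiling supports, the sketch below picks the compute unit that uses the least energy per inference while still meeting a latency budget. It does not use the AI Hub API; the `ProfileResult` and `pick_unit` names are hypothetical, and the numbers in the usage section are placeholders, not measurements from any device.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProfileResult:
    """One hypothetical profiling run of a model on a single compute unit."""

    unit: str            # e.g. "CPU", "GPU" or "NPU"
    latency_ms: float    # time per inference
    avg_power_mw: float  # average power draw while running

    @property
    def energy_mj(self) -> float:
        # mW * ms gives microjoules; divide by 1000 for millijoules.
        return self.avg_power_mw * self.latency_ms / 1000.0


def pick_unit(
    results: Iterable[ProfileResult], latency_budget_ms: float
) -> Optional[ProfileResult]:
    """Return the lowest-energy result within the latency budget, or None."""
    eligible = [r for r in results if r.latency_ms <= latency_budget_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.energy_mj)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real measurements.
    runs = [
        ProfileResult("CPU", latency_ms=120.0, avg_power_mw=2000.0),
        ProfileResult("GPU", latency_ms=40.0, avg_power_mw=3000.0),
        ProfileResult("NPU", latency_ms=25.0, avg_power_mw=1000.0),
    ]
    choice = pick_unit(runs, latency_budget_ms=50.0)
    print(choice.unit if choice else "no unit meets the budget")
```

A real application would feed in measured values per model and per device, and might weigh accuracy or memory use alongside latency and energy.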
While Qualcomm was focused on the end devices that consumers and prosumers use, IBM highlighted solutions for enterprises looking to take advantage of AI through their watsonx platform. At MWC, one of the many applications they highlighted was their watsonx call center assistant, which utilizes both traditional AI and generative AI depending on what the assistant is asked to do. Certain tasks like answering frequently asked questions with well-defined answers can be accomplished using traditional AI, while other tasks like asking the call center assistant to summarize the article that it had referred the caller to would need generative AI capabilities. Taking this type of hybrid approach helps enterprises optimize compute resource utilization, which ultimately leads to better cost management.
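The hybrid idea can be sketched in a few lines. This is not IBM’s implementation and does not use any watsonx API: the FAQ table, the `handle_request` function and the `generate` callable are all hypothetical, and a simple lookup table stands in for the traditional AI path, where a production system would more likely use a trained intent classifier.

```python
from typing import Callable, Tuple

# Hypothetical FAQ table standing in for the "traditional AI" path.
FAQ_ANSWERS = {
    "what are your opening hours": "We are open 9am to 5pm, Monday to Friday.",
    "how do i reset my password": "Use the 'Forgot password' link on the sign-in page.",
}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def handle_request(text: str, generate: Callable[[str], str]) -> Tuple[str, str]:
    """Answer from the cheap path when possible, else call the generative model.

    Returns (path_used, answer) so callers can track how often each path runs.
    """
    key = normalize(text)
    if key in FAQ_ANSWERS:
        return ("traditional", FAQ_ANSWERS[key])
    return ("generative", generate(text))


if __name__ == "__main__":
    # Stub in place of a real generative model call.
    def fake_generate(prompt: str) -> str:
        return f"[generated reply to: {prompt}]"

    print(handle_request("What are your opening hours?", fake_generate))
    print(handle_request("Summarize the article you sent me.", fake_generate))
```

Because the generative model is only called when the cheap path has no answer, the expensive compute is reserved for requests that actually need it, which is the cost argument made above.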
As enterprises start to incorporate AI into their workflows and processes, it is clear they cannot use generic models like ChatGPT given the need for their AI-based applications to access and utilize corporate and sensitive information. As such, most enterprises will need to either develop their own models or customize existing models with their own data. To help with this, the watsonx platform helps enterprises manage their data for use in AI training and inference with watsonx.data, create or fine-tune their own applications with watsonx.ai, and do so responsibly with watsonx.governance.
The next step for AI
We are just now entering the AI era and are still in its early stages. While 2023 was the year that captured everyone’s imagination around AI, 2024 is going to be about value creation and continued evolution. This year will show us what AI can do and prompt us to ask, “If it can do that, wouldn’t it be great if it could do…?”
If previous technological breakthroughs are any indication, once the global economy starts asking that question, the door to a brave new world will open, with uses for AI that are yet to be imagined.
By: DocMemory Copyright © 2023 CST, Inc. All Rights Reserved