Wednesday, March 12, 2025
Field-programmable gate arrays (FPGAs) are commonly used in designs that require high adaptability due to rapidly changing requirements, which would make the high cost of ASICs untenable. These designs typically involve relatively low-volume specialty applications, nascent markets where requirements are still maturing, nascent technologies where capabilities and standards are rapidly evolving, or, in the case of edge AI, a combination of all these factors.
Even with the general advantages that FPGAs have versus ASICs, not all FPGAs are designed equally. Various design considerations and tradeoffs can help a given FPGA solution be more optimized for a specific use case or application. Like most things, the more defined the target use case is, the more optimized the solution can be in terms of performance, power efficiency and form factor.
When designing a solution using FPGAs, many factors come into play. Considerations like the development environment, design tools and support community are all important aspects. However, it all starts with the hardware design. In the world of small FPGA platforms—typically FPGAs with a logic density of about 200k system logic cells (SLCs) or less, especially when targeted at developing edge AI applications—this can be boiled down to how well the design performs in terms of computational power, boot times, power consumption, form factor and security.
An example of a recently released small FPGA platform targeting embedded edge AI applications is Lattice Semiconductor’s Nexus 2 platform. Given the increased computational requirements of edge AI applications, Nexus 2 not only features enhanced processing and memory subsystems but also increased interconnect capabilities to support the higher data bandwidths required by AI applications, according to the company.
Compared with the previous generation, maximum logic density increased from 130k to 220k SLCs, DSP blocks more than tripled from 156 to 520, and SERDES bandwidth and LPDDR4 data rates nearly doubled.
Boot times are also critical in embedded designs, especially those used in edge AI applications. Boot times depend on the speed of the flash memory, flash clock frequency, interface type and configuration data size. In the Nexus 2, Lattice designed for this by increasing the flash clock frequency from 133 MHz to 160 MHz. It also improved the flash interface with DDR xSPI, which is 4× faster than the QSPI used in the previous generation, as well as by most of the current competition. Further, Nexus 2 achieves a smaller configuration data size by utilizing a 4-input lookup table (LUT4) architecture.
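The boot-time factors above can be sketched with a back-of-envelope calculation. The model and the 30 Mbit bitstream size below are illustrative assumptions, not Lattice specifications; it simply shows how interface width, double data rate and clock frequency combine.

```python
# Back-of-envelope estimate of FPGA configuration load time from SPI flash.
# Bitstream size and the resulting times are illustrative assumptions,
# not Lattice specifications.

def config_load_time_ms(bitstream_bits, clock_hz, bus_width_bits, ddr):
    """Time to stream a bitstream over a flash interface, ignoring
    command/setup overhead and any decompression stage."""
    edges_per_cycle = 2 if ddr else 1
    bits_per_second = clock_hz * bus_width_bits * edges_per_cycle
    return bitstream_bits / bits_per_second * 1000

# Hypothetical 30 Mbit configuration image:
qspi_sdr = config_load_time_ms(30e6, 133e6, 4, ddr=False)  # quad SDR @ 133 MHz
xspi_ddr = config_load_time_ms(30e6, 160e6, 8, ddr=True)   # octal DDR @ 160 MHz
print(f"QSPI SDR: {qspi_sdr:.1f} ms, DDR xSPI: {xspi_ddr:.1f} ms")
```

Under these assumptions the octal-DDR interface alone accounts for a 4× throughput gain over quad SDR, and the higher clock stretches the combined speedup to nearly 5×, before any reduction in configuration data size is counted.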
Another critical characteristic of small FPGAs going into edge AI devices is power consumption. Static power is typically driven by the number of configuration bits and by area utilization (better area efficiency leads to reduced leakage). Dynamic power consumption is also a factor and is driven by the number of logic levels required to implement a function.
The last factor in power consumption is the trade-off between performance and power efficiency made when optimizing the architecture (LUT4 vs. LUT6 vs. LUT8) for the intended workloads and applications. Nexus 2's focus on small FPGA use cases enabled Lattice to take advantage of a LUT4 architecture, which is better for static power and area efficiency while enabling sufficient performance for the targeted use cases.
While LUT6 is theoretically better than LUT4 in terms of dynamic power, some LUT6 designs also carry a larger architectural overhead to accommodate the added complexity, which can offset that theoretical advantage. When selecting an architecture, this breakpoint must be assessed to ensure that the LUT6 performance advantages more than offset the increased overhead.
As edge AI devices are typically constrained in size, weight or form factor, smaller form factors increase options for physical designs while increasing area efficiency, lowering costs and lowering power consumption. Returning to the Nexus 2 example, Lattice claims a 3-5× reduction in size compared to similarly capable competitive solutions. This form factor reduction was accomplished by using the LUT4 architecture, optimizing the DSPs for INT8 data types and architecting the SERDES block to provide comparable competitive performance at lower power.
Given the critical nature of most edge AI devices, security must not be overlooked and can be assessed in two ways—how the design recovers from attacks and how it prevents them. Highly reliable FPGAs are able to minimize the impact of attacks by reducing downtime in the event of a disruption.
For example, the rapid boot-time design choices explored earlier also enable Nexus 2's instant-on configuration capability, allowing the FPGA to recover quickly with minimal interruption. Preventing attacks, on the other hand, depends on the encryption algorithms supported natively by the FPGA. Because cryptography and attack vectors are constantly evolving, the key is ensuring that the device can support algorithms that address threats from day one through the intended life cycle of the platform. This allows easy upgrades as enhancements to current algorithms inevitably become available.
As these small FPGA platforms have life cycles of 15 years or more, it is prudent that any platform currently being considered supports post-quantum cryptographic algorithms, or at least can be upgraded to do so. Industry consensus estimates put the availability of cryptographically relevant quantum computers at around 2030, which is well within the life cycle of platforms using these FPGAs. The Nexus 2 not only supports proven ECDSA-521 and RSA-4K authentication, but it also supports symmetric and hash algorithms considered quantum-resistant, such as AES-GCM and SHA-3.
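To make the role of a quantum-resistant hash concrete, the sketch below checks a configuration image against a provisioned SHA-3 digest. This is a conceptual illustration only — it is not Lattice's actual boot-authentication flow, which relies on dedicated hardware and signed images — and the byte strings are made-up stand-ins for real bitstreams.

```python
import hashlib

# Conceptual sketch of hash-based bitstream integrity checking with SHA-3.
# Not a real FPGA boot flow; the "bitstream" bytes are placeholders.

def sha3_digest(bitstream: bytes) -> str:
    return hashlib.sha3_256(bitstream).hexdigest()

# Digest provisioned at manufacturing time for the known-good image:
golden = sha3_digest(b"example bitstream contents")

def verify(bitstream: bytes, expected: str) -> bool:
    """Accept only a bitstream whose digest matches the provisioned value."""
    return sha3_digest(bitstream) == expected

print(verify(b"example bitstream contents", golden))   # prints True
print(verify(b"tampered bitstream contents", golden))  # prints False
```

An upgradeable platform would let this digest function be swapped for a stronger successor over the device's life cycle, which is the upgrade path the paragraph above argues for.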
As various industries assess how to implement AI, it is clear that cost, latency and local context awareness are critical requirements. Compared with AI in the cloud, or even in high-functioning devices, such as AI PCs or AI smartphones, the embedded edge AI battleground is not as much about raw speeds and feeds but the optimization of those resources to enable focused edge AI capabilities balanced with demanding design factors. As the industry matures, solutions like Lattice’s new Nexus 2 platform are pushing the boundaries of edge AI FPGA innovation. The good thing about innovation is that it is powered by competition, and it will be exciting to see what is next, not only from Lattice but from its competitors, as well.
By: DocMemory Copyright © 2023 CST, Inc. All Rights Reserved