Tuesday, December 6, 2022
Rain Neuromorphics has trained a deep learning network on an analog chip—a crossbar array of memristors—using the company’s analog-friendly training algorithms.
The process required roughly five orders of magnitude less energy than training on today’s GPU systems. While Rain’s initial work shows that AI can be trained efficiently on analog chips, commercial realizations of the technology may still be a few years away.
In a paper co-authored with memristor pioneer Stanley Williams, Rain describes training single- and two-layer neural networks to recognize words written in braille. The setup uses a combination of two 64 x 64 memristor crossbar arrays (in this case, not the 3D ReRAM-based chip the company previously showed), combined with training algorithms using a technique called activity difference, which includes Rain’s earlier work on equilibrium propagation. Rain calls this hardware-algorithm combination memristor activity-difference energy minimization (MADEM).
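For readers unfamiliar with crossbars, the appeal is that an array of programmable conductances performs a matrix-vector multiplication in a single analog step: each device passes a current equal to its conductance times the applied voltage (Ohm's law), and the currents on each column wire add up (Kirchhoff's current law). The sketch below models only that ideal behavior. The function name, the conductance range, and the voltage range are illustrative assumptions, not values from Rain's paper; only the 64 x 64 size comes from the article.

```python
import random

def crossbar_currents(conductances, voltages):
    """Ideal crossbar read: column current I_j = sum_i G[i][j] * V[i].

    conductances: one list per row, one entry per column (siemens).
    voltages: one input voltage per row (volts).
    """
    rows, cols = len(conductances), len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("need one input voltage per row")
    return [
        sum(conductances[i][j] * voltages[i] for i in range(rows))
        for j in range(cols)
    ]

# A 64 x 64 array, the size reported in the article.
# The ranges below are assumed purely for illustration.
rng = random.Random(0)
N = 64
G = [[rng.uniform(1e-6, 1e-4) for _ in range(N)] for _ in range(N)]
V = [rng.uniform(0.0, 0.2) for _ in range(N)]
currents = crossbar_currents(G, V)  # 64 column currents, in amperes
```

In this picture the conductances play the role of the network's weights, so adjusting them is what training means on such hardware. A real array also has wire resistance, device variability, and noise, none of which this idealized model captures.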
Backpropagation, the training algorithm used almost exclusively in AI systems today, is incompatible with analog hardware since it is sensitive to the small variabilities and mismatches in on-chip analog devices. While compensation techniques have been used to make analog inference chips, these techniques have yet to prove successful for backpropagation-based training. Rain’s approach, which uses activity difference techniques, calculates local gradients instead of backpropagation’s repeated use of global gradients. The technique builds on previous work on equilibrium propagation training algorithms and is mathematically equivalent to backpropagation; in other words, it can be used to train mainstream deep learning networks.
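To give a feel for what a local, activity-difference style update looks like, here is a minimal sketch under strong simplifying assumptions: a single linear layer, a "free" phase in which the output settles on its own, and a "nudged" phase in which the output is pulled a fraction beta toward the target. Each weight changes using only its own input and the difference in its output's activity between the two phases. In this reduced setting the rule coincides with the classic delta rule. It is not Rain's MADEM implementation, and the function names, data, and hyperparameters are hypothetical.

```python
def forward(W, x):
    """Linear layer: y_j = sum_i W[j][i] * x[i]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def activity_difference_step(W, x, target, beta=0.1, lr=0.1):
    """One two-phase local update (toy single-layer version), in place."""
    # Free phase: the output settles with no teaching signal.
    y_free = forward(W, x)
    # Nudged phase: the output is pulled a fraction beta toward the target.
    y_nudged = [y + beta * (t - y) for y, t in zip(y_free, target)]
    for j, row in enumerate(W):
        # Local signal: change in this output's activity between phases.
        delta = (y_nudged[j] - y_free[j]) / beta
        for i, xi in enumerate(x):
            # Each weight sees only its own input and its own output.
            row[i] += lr * delta * xi
    return y_free

# Toy data (hypothetical): the exact solution is W = [[1, -1]].
data = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [-1.0]), ([1.0, 1.0], [0.0])]
W = [[0.0, 0.0]]
for _ in range(200):
    for x, t in data:
        activity_difference_step(W, x, t)
print(W)  # close to [[1.0, -1.0]]
```

In multi-layer equilibrium propagation the same contrast between two settled states is taken at every connection, which is why no separate backward pass carrying global error signals is needed.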
Compared with backpropagation-based training on a GPU, the time to train was reduced by two orders of magnitude (to tens of microseconds) and the energy consumed was reduced by five orders of magnitude (to hundreds of nanojoules). Scaled-up versions of MADEM should still offer a four-orders-of-magnitude advantage in energy consumption, per Rain’s projections.
“Over the course of the next 10 years, we intend to close the gap between what’s done today and the 100,000x that we know is possible,” Rain Neuromorphics CEO Gordon Wilson told EE Times. “The caveat is that this is not a product today. But this is a rigorous experiment that has done hardware-based measurements working with the noise of the system… making the analog work with you as opposed to fighting against it.”
Static device nonidealities are accounted for in the training process, while dynamical nonidealities, such as temporal stochasticity, can actually be used to improve performance. These nonidealities can be used for regularization—lowering the complexity of the neural network during training to avoid overfitting.
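As a rough illustration of temporal stochasticity, the sketch below treats every read of a weight as its stored value plus fresh Gaussian noise, so repeated forward passes on the same input give slightly different outputs that scatter around the noise-free answer. Training against such perturbed passes resembles the noise-injection regularization used in digital deep learning. The Gaussian noise model, the sigma value, and all names are assumptions for illustration, not a characterization of Rain's devices.

```python
import random

def noisy_read(weights, sigma, rng):
    """Return one noisy reading of each stored weight (assumed Gaussian noise)."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

def neuron_output(weights, x):
    """Weighted sum of the inputs."""
    return sum(w * xi for w, xi in zip(weights, x))

rng = random.Random(0)
weights = [0.5, -0.25, 1.0]   # stored values (hypothetical)
x = [1.0, 2.0, 3.0]

clean = neuron_output(weights, x)
# Every forward pass sees a slightly different set of effective weights.
samples = [neuron_output(noisy_read(weights, 0.05, rng), x) for _ in range(2000)]
mean = sum(samples) / len(samples)
spread = (sum((s - mean) ** 2 for s in samples) / len(samples)) ** 0.5
print(f"clean={clean:.3f} mean={mean:.3f} spread={spread:.3f}")
```

Because the noise is zero-mean in this model, the average output stays close to the clean value while individual passes vary, which discourages a network from relying on any one precisely tuned weight.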
“Our goal is to have the same accuracy as backpropagation, to take all the wins that have been demonstrated in the digital world with deep learning, we want to be able to move all that to an ultra-efficient platform,” Wilson said. “To do that, you need an algorithm that’s as smart as backpropagation, and you need a hardware substrate that’s as scalable as the GPU, but with orders of magnitude less power consumption.”
Rain’s recent work has been enabled by hardware and algorithm co-design, a trend Wilson sees as critical to next-generation AI systems.
“The more components you examine in parallel, the more of a full stack approach you take, the more comprehensively you can reimagine the whole system,” he said.
Rain’s vision is a fully analog, asynchronous, ultra-low power, tileable, scalable chip with capacity for 100 billion parameters that can imitate the human brain. While this work used a crossbar memristor array, Rain’s hardware roadmap still includes migration to randomly connected ReRAM cells as the technology matures.
“This allows us to have near-term productization opportunities that are still really valuable but take out the risk of ReRAM for our first go-to-market, as well as allows us to maintain that long-term goal of a fully analog continuous asynchronous learning system,” Wilson said.
Training and inference on the same platform is a big part of Rain’s plan to enable robust intelligence at the edge. Future applications would rely on personalization, adaptivity, and the ability to generalize from past experiences; today’s robots and autonomous vehicles can’t be trained on every possible scenario, so they will need the ability to learn as they go along, Wilson argues. Alongside cost and energy efficiency, this ability to learn on the fly is one of the key requirements for true autonomy.
Separately, Rain is working with Argonne National Laboratory to explore how its hardware could be used in Argonne’s particle accelerator. Experiments performed in the particle accelerator are monitored by X-ray sensors, with large amounts of data from these sensors typically transferred to a GPU cluster where AI identifies frames of interest that are relevant to the experiment.
Rain’s hardware could be installed next to the sensors to perform inference on the data without transferring to GPU servers. Continuous fine-tuning of the model is required to mitigate sensor drift, maintaining performance over time. In the future, this could be enabled by Rain’s on-chip training capabilities.
“We need this kind of fine tuning and continuous learning in more places than we originally realized (other potential partners are in electron microscopy, for example), since there are so many pieces of equipment that have a massive throughput of data where the ability to learn and fine tune is a necessary ingredient,” Wilson said.
Rain’s paper is titled “Activity-Difference Training of Deep Neural Networks using Memristor Crossbars.” Rain will also present a paper at IEDM this week discussing how its 3D ReRAM hardware design will exploit sparsity inherent to brain structures.