Guest Blog by Steven K. Esser
We are excited to announce that our work demonstrating convolutional networks for energy-efficient neuromorphic hardware just appeared in the peer-reviewed Proceedings of the National Academy of Sciences (PNAS) of the United States of America. Here is an open-access link. In this work, we show near state-of-the-art accuracy on 8 datasets while processing between 1200 and 2600 frames per second and consuming between 25 and 275 mW on the TrueNorth chip. Our approach adapts deep convolutional neural networks, today’s state-of-the-art approach for machine perception, to perform classification tasks on neuromorphic hardware such as TrueNorth. This work overcomes the challenge of learning a network that uses low precision synapses, spiking neurons, and limited fan-in: constraints that allow for energy-efficient neuromorphic hardware design (including TrueNorth) but that are not imposed in conventional deep learning.
Low precision synapses make it possible to co-locate processing and memory in hardware, dramatically reducing the energy needed for data movement. Employing spiking neurons (which have only a binary state) further reduces data traffic and computation, leading to additional energy savings. However, contemporary convolutional networks use high precision (at least 32-bit) neurons and synapses. This high precision representation allows for very small changes to the neural network during learning, facilitating incremental improvements of the response to a given input during training. In this work, we introduce two constraints to the learning rule to bridge deep learning to energy-efficient neuromorphic hardware. First, we maintain a high precision “shadow” synaptic weight during training, but map this weight to a trinary value (-1, 0, or 1) that the hardware can implement; the trinary value is used during operation and also, critically, to determine how the network should change during training itself. Second, we employ spiking neurons during both training and operation, but use the behavior of a high precision neuron to predict how the spiking neuron should change as network weights change.
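To make the general idea concrete, here is a minimal, hypothetical PyTorch sketch of both tricks: a high precision shadow weight that is mapped to {-1, 0, +1} for the forward pass, and a binary spiking activation whose gradient is taken from a smooth high precision proxy. This is only an illustration of the broad technique, not the training procedure from the paper; the names (TrinarizeSTE, SpikeSTE), the 0.5 threshold, and the sigmoid proxy are all invented for the example.

```python
# Illustrative sketch only: trinary weights with a high precision "shadow" copy,
# plus binary spiking activations trained via a smooth high precision proxy.
import torch
import torch.nn as nn


class TrinarizeSTE(torch.autograd.Function):
    """Map real-valued weights to {-1, 0, +1}; pass gradients straight through."""

    @staticmethod
    def forward(ctx, w):
        w_tri = torch.zeros_like(w)
        w_tri[w > 0.5] = 1.0      # threshold of 0.5 is an arbitrary choice here
        w_tri[w < -0.5] = -1.0
        return w_tri

    @staticmethod
    def backward(ctx, grad_output):
        # Treat the mapping as identity in the backward pass, so gradient updates
        # accumulate in the high precision shadow weight.
        return grad_output


class TrinaryLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # High precision shadow weights, updated by the optimizer as usual.
        self.shadow_weight = nn.Parameter(0.1 * torch.randn(out_features, in_features))

    def forward(self, x):
        # Both training and deployment use the trinary weights in the forward pass.
        return x @ TrinarizeSTE.apply(self.shadow_weight).t()


class SpikeSTE(torch.autograd.Function):
    """Binary spiking activation; backward uses a smooth high precision proxy."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()  # spike (1) if the input crosses threshold, else 0

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Derivative of a sigmoid stands in for the non-differentiable step.
        sig = torch.sigmoid(v)
        return grad_output * sig * (1 - sig)


if __name__ == "__main__":
    layer = TrinaryLinear(16, 4)
    x = torch.randn(8, 16)
    out = SpikeSTE.apply(layer(x))         # binary activations
    out.sum().backward()                   # gradients land on the shadow weights
    print(layer.shadow_weight.grad.shape)  # torch.Size([4, 16])
```

After training, only the trinary weights and spiking behavior need to be deployed; the high precision shadow copies exist solely to make gradient-based learning possible.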
Limiting neuron fan-in (the number of inputs each neuron can receive) allows for a crossbar design in hardware that reduces spike traffic, again leading to lower energy consumption. However, conventional deep learning places no explicit restriction on the number of inputs per neuron, which allows networks to scale while remaining fully integrated. In this work, we enforce this connectivity constraint by partitioning convolutional filters into multiple groups. We maintain network integration by interspersing layers that have large topographic coverage but small feature coverage with layers that have small topographic coverage but large feature coverage. Essentially, this allows the network to learn complicated spatial features, and then to learn useful mixtures of those features.
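A rough sketch of this connectivity pattern, using standard grouped convolutions in PyTorch: a spatial layer sees a wide window but only a subset of input features (bounding each neuron's fan-in), and is followed by a 1x1 layer that sees every feature at a single location to mix the groups back together. The channel counts and group sizes below are invented for illustration and do not correspond to an actual TrueNorth network; trinary weights and spiking neurons are also omitted here for clarity.

```python
# Hypothetical example of bounded fan-in via grouped convolutions,
# alternated with 1x1 feature-mixing layers.
import torch
import torch.nn as nn

fan_in_limited_block = nn.Sequential(
    # Large topographic coverage (3x3 window), small feature coverage:
    # with groups=8, each output neuron connects to only 128/8 = 16 input channels,
    # so its fan-in is 3 * 3 * 16 = 144 inputs.
    nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=8, bias=False),
    nn.ReLU(),
    # Small topographic coverage (1x1 window), large feature coverage:
    # each neuron sees all 128 features at one location (fan-in of 128),
    # re-mixing the groups so the network stays integrated.
    nn.Conv2d(128, 128, kernel_size=1, bias=False),
    nn.ReLU(),
)

x = torch.randn(1, 128, 32, 32)
print(fan_in_limited_block(x).shape)  # torch.Size([1, 128, 32, 32])
```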
Our results show that the operational and structural differences between neuromorphic computing and deep learning are not fundamental and, indeed, that near state-of-the-art performance can be achieved in a low-power system by merging the two. In a previous post, we mentioned one example of an application that runs convolutional networks trained with our method on TrueNorth to perform real-time gesture recognition.
The algorithm is now in the hands of over 130 users at over 40 institutions, and we are excited to discover what else might now be possible!