Dharmendra S. Modha

My Work and Thoughts.

10¹⁴

November 14, 2012 By dmodha

Caption: Authors†: (Clockwise, starting at center top)
Dharmendra S. Modha, Myron D. Flickner, Emmett McQuinn, Steven K. Esser, Robert Preissl, Pallab Datta, Horst D. Simon, Rathinakumar Appuswamy, Theodore M. Wong, William P. Risk
(Photo Credit: Hita Bambhania-Modha)

Moments ago, at Supercomputing 2012, IBM and LBNL presented the next milestone towards fulfilling the vision of the DARPA SyNAPSE program.

TITLE: 

Compass: A scalable simulator for an architecture for Cognitive Computing

AUTHORS:

Robert Preissl
Theodore M. Wong
Pallab Datta
Myron D. Flickner
Raghavendra Singh
Steven K. Esser
Emmett McQuinn*
Rathinakumar Appuswamy*
William P. Risk
Horst D. Simon (LBNL)
Dharmendra S. Modha

*Since submitting the camera ready copy, these colleagues contributed significantly to visualization and dynamics.

ABSTRACT:

Inspired by the function, power, and volume of the organic brain, IBM is developing TrueNorth, a novel modular, scalable, non-von Neumann, ultra-low power, cognitive computing architecture. TrueNorth consists of a scalable network of neurosynaptic cores, with each core containing neurons, dendrites, synapses, and axons. To set sail for TrueNorth, IBM developed Compass, a multi-threaded, massively parallel functional simulator and a parallel compiler that maps a network of long-distance pathways in the macaque monkey brain to TrueNorth.

IBM and LBNL demonstrated near-perfect weak scaling on a 16-rack IBM Blue Gene/Q (262,144 processor cores, 256 TB memory), achieving an unprecedented scale of 256 million neurosynaptic cores containing 65 billion neurons and 16 trillion synapses running only 388× slower than real time with an average spiking rate of 8.1 Hz. By using emerging PGAS communication primitives, IBM also demonstrated 2× better real-time performance over MPI primitives on a 4-rack Blue Gene/P (16,384 processor cores, 16 TB memory). Here is the PDF of the final paper.

NEW NEWS: Since submitting the camera ready copy, using 96 Blue Gene/Q racks of the Lawrence Livermore National Lab Sequoia supercomputer (1,572,864 processor cores, 1.5 PB memory, 98,304 MPI processes, and 6,291,456 threads), IBM and LBNL achieved an unprecedented scale of 2.084 billion neurosynaptic cores containing 53×10¹⁰ neurons and 1.37×10¹⁴ synapses running only 1542× slower than real time. Here is the PDF of IBM Research Report RJ 10502.
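As a quick sanity check on the figures above, the per-core and per-neuron ratios can be recomputed from the rounded numbers quoted in this post. This is only a sketch: the dictionary layout and function names are illustrative, and because the inputs are rounded, the ratios come out near (not exactly at) 256 neurons per core and 256 synapses per neuron.

```python
# Figures as quoted (and rounded) in the post for the two runs.
runs = {
    "16-rack Blue Gene/Q": {
        "cores": 256e6, "neurons": 65e9, "synapses": 16e12, "slowdown": 388,
    },
    "96-rack Sequoia": {
        "cores": 2.084e9, "neurons": 53e10, "synapses": 1.37e14, "slowdown": 1542,
    },
}


def ratios(run):
    """Return neurons per neurosynaptic core and synapses per neuron."""
    return {
        "neurons_per_core": run["neurons"] / run["cores"],
        "synapses_per_neuron": run["synapses"] / run["neurons"],
    }


def wall_clock_seconds(run, simulated_seconds):
    """Wall-clock time needed for a given span of simulated time."""
    return run["slowdown"] * simulated_seconds


for name, run in runs.items():
    r = ratios(run)
    print(
        f"{name}: {r['neurons_per_core']:.1f} neurons/core, "
        f"{r['synapses_per_neuron']:.1f} synapses/neuron, "
        f"{wall_clock_seconds(run, 1.0):.0f} s of wall clock per simulated second"
    )
```

Read this way, "1542× slower than real time" means that one second of simulated network activity took about 1542 seconds (roughly 26 minutes) of wall-clock time on Sequoia.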

SIGNIFICANCE:

The ultimate vision of the DARPA SyNAPSE program is to build a cognitive computing architecture with 10¹⁰ neurons and 10¹⁴ synapses. “The vision for the Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE) program is to develop electronic neuromorphic machine technology that scales to biological levels.” For reference, the DARPA SyNAPSE BAA from 2008 is here. This DARPA SyNAPSE metric was probably inspired by the following: Gordon Shepherd in The Synaptic Organization of the Brain estimates the number of synapses in the human brain as 0.6×10¹⁴, and Christof Koch in Biophysics of Computation: Information Processing in Single Neurons estimates the number of synapses in the human brain as 2.4×10¹⁴.

CLARIFICATION:

We have not built a biologically realistic simulation of the complete human brain. Rather, we have simulated a novel modular, scalable, non-von Neumann, ultra-low power, cognitive computing architecture at the scale of the DARPA SyNAPSE metric of 10¹⁴ synapses that, in turn, is inspired by the number of synapses in the human brain. Computation (“neurons”), memory (“synapses”), and communication (“axons”, “dendrites”) are mathematically abstracted away from biological detail towards engineering goals of maximizing function (utility, applications) and minimizing cost (power, area, delay) and design complexity of hardware implementation.

PAPER STATUS:

Per SC12 website:  “The SC12 Technical Papers program received 472 submissions covering a wide variety of research topics in high performance computing. We followed a rigorous peer review process with a newly introduced author rebuttal period, careful management of conflicts, and four reviews per submission (in most cases). At a two-day face-to-face committee meeting on June 25-26 in Salt Lake City, over 100 technical paper committee members discussed every paper and finalized the selections. At the conclusion of the meeting, SC12 Technical Papers had accepted 100 papers, reflecting an acceptance rate of 21 percent.” Six of the 100 accepted papers, including this one, were selected as finalists for the Best Paper Award.

PERSPECTIVE:

Over the last 6 years, powered by Blue Gene/L, Blue Gene/P, and Blue Gene/Q, and with support from DARPA and DOE / NNSA / LLNL, the simulations have scaled from 4,096 processor cores and 1 TB of main memory in February 2007 to 8,192 processors and 4 TB of main memory in July 2007 to 32,768 processor cores and 8 TB of main memory in November 2007 to 147,456 processor cores and 144 TB of main memory in November 2009 to 262,144 processor cores and 256 TB of main memory in April 2012 to, finally, 1,572,864 processor cores and 1.5 PB of main memory in October 2012.
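To put that progression in numbers, here is a small sketch that tabulates the milestones listed above and computes the overall growth. The conversion of 1.5 PB to 1,536 TB assumes binary units (1 PB = 1,024 TB), which the post does not state; with decimal units the memory growth would be 1,500× instead.

```python
# Assumption: binary petabyte. Use 1000 here for decimal units.
PB_IN_TB = 1024

# (date, processor count, main memory in TB) as listed in the post.
# The July 2007 entry is given as "processors" rather than "processor cores".
milestones = [
    ("Feb 2007", 4_096, 1),
    ("Jul 2007", 8_192, 4),
    ("Nov 2007", 32_768, 8),
    ("Nov 2009", 147_456, 144),
    ("Apr 2012", 262_144, 256),
    ("Oct 2012", 1_572_864, 1.5 * PB_IN_TB),
]


def growth(values):
    """Ratio of the last value to the first."""
    return values[-1] / values[0]


core_growth = growth([m[1] for m in milestones])
memory_growth = growth([m[2] for m in milestones])

for date, cores, memory_tb in milestones:
    print(f"{date}: {cores:>9,} processors, {memory_tb:>7,.0f} TB")
print(f"Processor growth: {core_growth:.0f}x, memory growth: {memory_growth:.0f}x")
```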

Previously, we have demonstrated a neurosynaptic core and some of its applications. We have also compiled the largest long-distance wiring diagram in the monkey brain. Now, imagine a network with over 2 billion of these neurosynaptic cores that are divided into 77 brain-inspired regions with probabilistic intra-region (“gray matter”) connectivity and monkey-brain-inspired inter-region (“white matter”) connectivity. The new paper simulates the dynamics of such a network on the Top #2 supercomputer, LLNL’s Sequoia, and drives the dynamics to a self-critical state. This fulfills a core vision of the DARPA SyNAPSE project to bring together nanotechnology, neuroscience, and supercomputing to lay the foundation of a novel cognitive computing architecture that complements today’s von Neumann machines.

APPLICATIONS OF COMPASS:

The Compass simulator is an all-purpose “swiss-army knife” to pursue novel architectures, algorithms, and applications. Compass is indispensable for (a) verifying TrueNorth correctness via regression testing, (b) studying TrueNorth dynamics, (c) benchmarking inter-core communication topologies, (d) demonstrating applications in vision, audition, real-time motor control, and sensor integration, (e) estimating power consumption, and (f) hypothesis testing, verification, and iteration regarding neural codes and function. We have used Compass to demonstrate numerous applications of the TrueNorth architecture, such as optic flow, attention mechanisms, image and audio classification, multi-modal image-audio classification, character recognition, robotic navigation, and spatio-temporal feature extraction. These applications will be published separately.

THANKS:

IBM would like to thank DARPA, DARPA DSO, SyNAPSE Program Manager: Dr. Gill A. Pratt, and Former SyNAPSE Program Manager: Dr. Todd Hylton. The research reported in this presentation was sponsored by Defense Advanced Research Projects Agency, Defense Sciences Office (DSO), Program: Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE), Issued by DARPA/CMO under Contract No. HR0011-09-C-0002.

IBM and LBNL would like to thank Michel McCoy and Tom Spelce for access to the Sequoia Blue Gene/Q supercomputer at Lawrence Livermore National Laboratory and the DOE NNSA Advanced Simulation and Computing Program for time on Sequoia. Lawrence Livermore National Laboratory is operated by Lawrence Livermore National Security, LLC, for the U.S. Department of Energy, National Nuclear Security Administration under Contract DE-AC52-07NA27344.

The authors are indebted to Fred Mintzer for access to IBM Blue Gene/P and Blue Gene/Q at the IBM T.J. Watson Research Center and to George Fax, Kerry Kaliszewski, Andrew Schram, Faith W. Sell, Steven M. Westerbeck for access to IBM Rochester Blue Gene/Q, without which this paper would have been impossible.

The authors thank Filipp Akopyan, Rodrigo Alvarez-Icaza, John Arthur, Andrew Cassidy, Daniel Friedman, Subu Iyer, Bryan Jackson, Rajit Manohar, Paul Merolla, and Jun Sawada for their collaboration on the TrueNorth architecture, and our university partners Stefano Fusi, Rajit Manohar, Ashutosh Saxena, and Giulio Tononi as well as their research teams for their feedback on the Compass simulator.

Finally, the authors would like to thank David Peyton for his expert editorial assistance in revising the manuscript.

TYPOS:

Section III, Listing 1: move “threadAggregate( remoteBuf, remoteBufAgg)” immediately below the line “if ( threadID == 0 ) {”

Section V.C., “80/20” should be “20/80”

TO LEARN MORE:

VIDEOS:

Keynote at DAC (~1 hour)

The Cognitive Systems Era (~5min)

PAST IBM PRESS RELEASES:

DARPA SyNAPSE Phase 0

DARPA SyNAPSE Phase 1

DARPA SyNAPSE Phase 2

COMPASS ALGORITHM:

Each MPI process executes the following algorithmic flow with 1 master thread and 64 slave threads.
For the flagship 10¹⁴ run, there were 98,304 MPI processes and 6,291,456 threads.
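The process and thread counts quoted for the flagship run can be broken down per MPI process and per processor core with a little arithmetic. The sketch below uses only figures from this post; the variable names are illustrative, and the neurosynaptic core count is the rounded 2.084 billion, so the per-process and per-thread loads are approximate.

```python
# Figures quoted in the post for the 96-rack Sequoia run.
mpi_processes = 98_304
threads = 6_291_456
processor_cores = 1_572_864
neurosynaptic_cores = 2.084e9  # rounded figure from the post

# The totals divide evenly, so integer division is exact here.
assert threads % mpi_processes == 0
assert processor_cores % mpi_processes == 0

threads_per_process = threads // mpi_processes
processor_cores_per_process = processor_cores // mpi_processes
threads_per_processor_core = threads // processor_cores

# Approximate simulation load carried by each process and thread.
neuro_cores_per_process = neurosynaptic_cores / mpi_processes
neuro_cores_per_thread = neurosynaptic_cores / threads

print(f"Threads per MPI process:          {threads_per_process}")
print(f"Processor cores per MPI process:  {processor_cores_per_process}")
print(f"Threads per processor core:       {threads_per_processor_core}")
print(f"Neurosynaptic cores per process: ~{neuro_cores_per_process:,.0f}")
print(f"Neurosynaptic cores per thread:  ~{neuro_cores_per_thread:,.0f}")
```

By these totals, each MPI process accounts for 64 threads spread over 16 processor cores, and each thread carries a few hundred simulated neurosynaptic cores.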

Filed Under: Accomplishments, Brain-inspired Computing, Papers

Building Block of a Programmable Neuromorphic Substrate: A Digital Neurosynaptic Core

June 19, 2012 By dmodha

Last week, the IBM-Cornell SyNAPSE Team published the following paper:

Citation: John V. Arthur, Paul A. Merolla, Filipp Akopyan, Rodrigo Alvarez-Icaza, Andrew Cassidy, Shyamal Chandra, Steven K. Esser, Nabil Imam, William Risk, Daniel Rubin, Rajit Manohar, and Dharmendra S. Modha, "Building Block of a Programmable Neuromorphic Substrate: A Digital Neurosynaptic Core", International Joint Conference on Neural Networks, June 2012.

Abstract: The grand challenge of neuromorphic computation is to develop a flexible brain-like architecture capable of a wide array of real-time applications, while striving towards the ultra-low power consumption and compact size of biological neural systems. To this end, we fabricated a key building block of a modular neuromorphic architecture, a neurosynaptic core. Our implementation consists of 256 integrate-and-fire neurons and a 1,024×256 SRAM crossbar memory for synapses that fits in 4.2mm² using a 45nm SOI process and consumes just 45pJ per spike. The core is fully configurable in terms of neuron parameters, axon types, and synapse states and its fully digital implementation achieves one-to-one correspondence with software simulation models. One-to-one correspondence allows us to introduce an abstract neural programming model for our chip, a contract guaranteeing that any application developed in software functions identically in hardware. This contract allows us to rapidly test and map applications from control, machine vision, and classification. To demonstrate, we present four test cases (i) a robot driving in a virtual environment, (ii) the classic game of pong, (iii) visual digit recognition and (iv) an autoassociative memory.
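To make the crossbar idea concrete, here is a toy software model of a core with 1,024 input axons, 256 integrate-and-fire neurons, and a binary 1,024×256 crossbar. It is a minimal sketch, not the chip's actual neuron model: the class name, the uniform synaptic weight of +1, the absence of leak and axon types, the reset-to-zero rule, and the random crossbar density are all assumptions made for illustration.

```python
import random


class NeurosynapticCore:
    """Toy core: binary crossbar from axons (rows) to neurons (columns)."""

    def __init__(self, num_axons=1024, num_neurons=256,
                 threshold=4, density=0.1, seed=0):
        rng = random.Random(seed)
        self.num_axons = num_axons
        self.num_neurons = num_neurons
        self.threshold = threshold  # hypothetical firing threshold
        # crossbar[a][n] is True when axon a has a synapse onto neuron n.
        self.crossbar = [
            [rng.random() < density for _ in range(num_neurons)]
            for _ in range(num_axons)
        ]
        self.potential = [0] * num_neurons

    def tick(self, active_axons):
        """Advance one time step and return the indices of neurons that fired."""
        # Integrate: each active axon adds +1 to every neuron it connects to.
        for a in active_axons:
            row = self.crossbar[a]
            for n in range(self.num_neurons):
                if row[n]:
                    self.potential[n] += 1
        # Fire: neurons at or above threshold spike and reset to zero.
        fired = []
        for n in range(self.num_neurons):
            if self.potential[n] >= self.threshold:
                fired.append(n)
                self.potential[n] = 0
        return fired


core = NeurosynapticCore()
spikes = core.tick(active_axons=range(0, 64))
print(f"{len(spikes)} of {core.num_neurons} neurons fired on this tick")
```

Because every step is plain integer arithmetic on fixed state, a model like this is deterministic for a given configuration and input, which is the property that makes a one-to-one match between software and digital hardware possible in principle.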

Filed Under: Accomplishments, Brain-inspired Computing, Papers

Implementation of olfactory bulb glomerular-layer computations in a digital neurosynaptic core

June 6, 2012 By dmodha

Today, the Cornell-IBM SyNAPSE Team published the following paper:

Citation: Imam N, Cleland TA, Manohar R, Merolla PA, Arthur JV, Akopyan F and Modha DS (2012) Implementation of olfactory bulb glomerular-layer computations in a digital neurosynaptic core. Front. Neurosci. 6:83. doi: 10.3389/fnins.2012.00083

Abstract: We present a biomimetic system that captures essential functional properties of the glomerular layer of the mammalian olfactory bulb, specifically including its capacity to decorrelate similar odor representations without foreknowledge of the statistical distributions of analyte features. Our system is based on a digital neuromorphic chip consisting of 256 leaky-integrate-and-fire neurons, 1024 × 256 crossbar synapses, and address-event representation communication circuits. The neural circuits configured in the chip reflect established connections among mitral cells, periglomerular cells, external tufted cells, and superficial short-axon cells within the olfactory bulb, and accept input from convergent sets of sensors configured as olfactory sensory neurons. This configuration generates functional transformations comparable to those observed in the glomerular layer of the mammalian olfactory bulb. Our circuits, consuming only 45 pJ of active power per spike with a power supply of 0.85 V, can be used as the first stage of processing in low-power artificial chemical sensing devices inspired by natural olfactory systems.

Filed Under: Accomplishments, Brain-inspired Computing, Papers

The Cognitive Systems Era

June 2, 2012 By dmodha

YouTube video (5 minutes and 16 seconds) describing my team’s work in the context of IBM’s Cognitive Systems Era: http://www.youtube.com/watch?v=gQ3HEVelBFY

Filed Under: Accomplishments, Brain-inspired Computing, Presentations, Press

ASYNC 2012: A Digital Neurosynaptic Core Using Event-Driven QDI Circuits

May 10, 2012 By dmodha

Building on recently published cognitive computing chip technology, this week at ASYNC 2012 (IEEE International Symposium on Asynchronous Circuits and Systems), the Cornell-IBM team published a new paper that won the Best Paper Award.

TITLE: A Digital Neurosynaptic Core Using Event-Driven QDI Circuits

AUTHORS: Nabil Imam, Filipp Akopyan, John Arthur, Paul Merolla, Rajit Manohar, Dharmendra S Modha

ABSTRACT: We design and implement a key building block of a scalable neuromorphic architecture capable of running spiking neural networks in compact and low-power hardware. Our innovation is a configurable neurosynaptic core that combines 256 integrate-and-fire neurons, 1024 input axons, and 1024×256 synapses in 4.2mm² of silicon using a 45nm SOI process. We are able to achieve ultra-low energy consumption 1) at the circuit-level by using an asynchronous design where circuits only switch while performing neural updates; 2) at the core-level by implementing a 256 neural fanout in a single operation using a crossbar memory; and 3) at the architecture level by restricting core-to-core communication to spike events, which occur relatively sparsely in time. Our implementation is purely digital, resulting in reliable and deterministic operation that achieves for the first time one-to-one correspondence with a software simulator. At 45pJ per spike, our core is readily scalable and provides a platform for implementing a wide array of real-time computations. As an example, we demonstrate a sound localization system using coincidence-detecting neurons.
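For a feel of what 45pJ per spike implies, the sketch below multiplies that energy by a spike rate to get an average power figure. It is a rough, back-of-the-envelope estimate under stated assumptions: the 10 Hz mean firing rate is hypothetical, the function name is illustrative, and only per-spike energy is counted (static and leakage power are ignored).

```python
ENERGY_PER_SPIKE_J = 45e-12  # 45 pJ per spike, from the abstract


def spike_power_watts(num_neurons, mean_rate_hz,
                      energy_per_spike_j=ENERGY_PER_SPIKE_J):
    """Average power attributable to spiking: energy/spike x spikes/second."""
    spikes_per_second = num_neurons * mean_rate_hz
    return energy_per_spike_j * spikes_per_second


# One core of 256 neurons at a hypothetical mean rate of 10 Hz.
power = spike_power_watts(num_neurons=256, mean_rate_hz=10)
print(f"Estimated spiking power: {power * 1e9:.1f} nW")
```

Since the estimate is linear in both neuron count and firing rate, sparse spiking translates directly into lower power, which is the point made in the abstract about restricting communication to spike events.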

Filed Under: Accomplishments, Brain-inspired Computing, Papers
