
Dharmendra S. Modha

My Work and Thoughts.

Archives for September 2025

Computer History Museum Interview

September 7, 2025 By dmodha

Computer History Museum interview on the occasion of NorthPole’s induction into the Museum. Other interviewees include: John Backus (Fortran), Brian Kernighan (UNIX), Robert Metcalfe (Ethernet, 3Com), Gordon Moore (Moore’s Law), Robert Kahn (TCP/IP), Douglas Engelbart (hypertext), Ronald Rivest (RSA), John McCarthy (LISP), Donald Knuth (analysis of algorithms), James Gosling (Java), John Hennessy (RISC), Ken Thompson (UNIX, B), Rodney Brooks (robotics).

Filed Under: Press

EE Times Interview by Sunny Bains

September 7, 2025 By dmodha

Sunny Bains interviewed me for Brains and Machines. It captures our journey through DARPA SyNAPSE, TrueNorth, and NorthPole. Listen here.

Filed Under: Press

SiLQ: Simple Large Language Model Quantization-Aware Training

September 6, 2025 By dmodha

Thrilled to share the latest work from the IBM Research NorthPole Team pushing the cutting edge of quantized large language model performance. In a recent paper, we introduce a new quantization recipe and apply it to 8-billion-parameter Granite and Llama models. With 8-bit activations and cache and 4-bit weights, these models show minimal accuracy degradation across three leaderboards spanning 20 distinct tasks.

Our method is accurate, outperforming all prior published quantization methods on the models and precisions examined; simple, reusing existing training code after adding appropriate quantization and knowledge distillation; and relatively low-cost, reusing existing training data or publicly available datasets and requiring an increase in total training budget of less than 0.1%. We believe this will be a powerful enabling tool for deploying models on ultra-low-latency inference accelerators like NorthPole, greatly enhancing the performance of latency-critical applications such as interactive dialog and agentic workflows.
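For readers curious how the two ingredients named above fit together mechanically, the sketch below illustrates quantization-aware training with fake-quantized 4-bit weights and 8-bit activations, combined with knowledge distillation from a full-precision teacher, in PyTorch. It is a minimal illustration under stated assumptions, not the SiLQ implementation; the single linear layer, the straight-through estimator, and the distillation temperature are hypothetical choices for demonstration only.

```python
# Minimal sketch of quantization-aware training (QAT) with knowledge
# distillation. NOT the SiLQ recipe: layer sizes, bit-widths, and loss
# settings here are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric per-tensor fake quantization with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()  # forward: quantized values; backward: identity gradient


class QuantLinear(nn.Linear):
    """Linear layer with 4-bit weights and 8-bit input activations (fake-quantized)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(fake_quant(x, bits=8), fake_quant(self.weight, bits=4), self.bias)


def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between teacher and student output distributions."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)


if __name__ == "__main__":
    torch.manual_seed(0)
    teacher = nn.Linear(16, 8)       # stands in for the full-precision model
    student = QuantLinear(16, 8)     # quantized copy, initialized from the teacher
    student.load_state_dict(teacher.state_dict())

    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)
    for _ in range(100):
        x = torch.randn(32, 16)
        with torch.no_grad():
            teacher_logits = teacher(x)
        loss = distillation_loss(student(x), teacher_logits)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"final distillation loss: {loss.item():.4f}")
```

In this toy setup the student reuses the ordinary training loop unchanged; only the fake-quantization wrappers and the distillation loss are added, which is the sense in which a QAT-plus-distillation recipe can piggyback on existing training code.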

The paper, written with co-authors Jeffrey McKinstry, Deepika Bablani, Rathinakumar Appuswamy, and Dharmendra Modha, can be found here.

Filed Under: Papers
