IBM used AMD’s other not-so-secret weapon to deliver yet another brain-like chip — NorthPole is faster and uses less energy than Nvidia’s best AI GPU and it’s only the beginning

Key Takeaways:

– IBM’s NorthPole chip is 25 times more energy efficient than commonly used 12nm GPUs and 14nm CPUs, as measured on the ResNet-50 model.
– The chip is built on the TrueNorth architecture and achieves high performance in terms of latency and space required to compute.
– The chip has memory embedded in each of its 256 cores, eliminating the need for separate memory connections.
– NorthPole comprises 22 billion transistors, and each core can perform 2,048 operations per cycle.
– The chip eliminates the Von Neumann bottleneck by blurring the boundary between compute and memory.
– AMD has also tapped into the concept of combining memory and compute on a single component.
– NorthPole is well-suited for emerging AI use cases and edge applications that require real-time data processing.

TechRadar:

IBM’s NorthPole chip – nearly ten years in the making – is going from strength to strength and has achieved a new milestone, with researchers publishing a set of impressive benchmarking results in the journal Science.

The 12nm chip, built on the TrueNorth architecture, is 25 times more energy efficient than commonly used 12nm GPUs and 14nm CPUs. This is according to testing on the ResNet-50 model, with efficiency measured as the number of frames interpreted per joule of energy.


AI Eclipse TLDR:

IBM’s NorthPole chip, which took nearly ten years to develop, has achieved a significant milestone with researchers publishing impressive benchmarking results in the journal Science. The chip, built on the TrueNorth architecture and using a 12nm process, is 25 times more energy efficient than commonly used 12nm GPUs and 14nm CPUs. This efficiency was measured by testing the chip on the ResNet-50 model and calculating the number of frames interpreted per joule of energy. The researchers also report that NorthPole outperforms all major architectures, including a GPU implemented on a more advanced 4nm process.
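
To make the efficiency metric concrete, here is a minimal Python sketch of how frames per joule can be derived from throughput and power draw. The function names and every number in it are hypothetical placeholders chosen for illustration; they are not measurements from the Science paper.

```python
def frames_per_joule(frames_per_second: float, power_watts: float) -> float:
    """Energy efficiency: frames processed per joule of energy.

    One watt is one joule per second, so dividing throughput (frames/s)
    by power draw (J/s) leaves frames per joule.
    """
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return frames_per_second / power_watts


def efficiency_ratio(chip_a: float, chip_b: float) -> float:
    """How many times more energy efficient chip A is than chip B."""
    return chip_a / chip_b


# Hypothetical figures, for illustration only (not taken from the paper).
accelerator = frames_per_joule(frames_per_second=1000.0, power_watts=10.0)
baseline = frames_per_joule(frames_per_second=1000.0, power_watts=250.0)

print(f"accelerator: {accelerator} frames/J")
print(f"baseline:    {baseline} frames/J")
print(f"ratio:       {efficiency_ratio(accelerator, baseline)}x")
```

With these made-up inputs, two chips running at the same frame rate differ in efficiency only through their power draw, which is why a lower-power part can lead on this metric without being faster in absolute terms.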

The NorthPole chip’s success can be attributed to its unique architecture and features. It incorporates memory on the chip itself, embedded in each of its 256 cores. The chip comprises 22 billion transistors, and each core can perform 2,048 operations per cycle. This design eliminates the Von Neumann bottleneck, which refers to the delays caused by data having to travel between the CPU and RAM in most systems. As a result, the NorthPole chip performs much faster than the best GPUs currently available, including AI-centric graphics cards by Nvidia.
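
As a rough back-of-the-envelope check on those core figures, the sketch below multiplies them out. It assumes the 2,048 figure is a per-core, per-cycle operation count; the function name is hypothetical, and the result is a theoretical peak, not a measured throughput.

```python
CORES = 256                     # cores on the chip, per the article
OPS_PER_CORE_PER_CYCLE = 2048   # assumed to be a per-cycle count


def peak_ops_per_cycle(cores: int, ops_per_core: int) -> int:
    """Theoretical peak operations the whole chip can issue in one cycle."""
    return cores * ops_per_core


total = peak_ops_per_cycle(CORES, OPS_PER_CORE_PER_CYCLE)
print(f"{total:,} operations per cycle across all cores")
```

Turning this into operations per second would require the clock frequency, which the article does not give.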

IBM Research’s Dharmendra Modha described NorthPole as blurring the boundary between compute and memory. At the level of individual cores, it appears as memory-near-compute, and from outside the chip, it appears as an active memory. This unique architecture makes the NorthPole chip ideal for emerging AI use cases, such as computer vision and natural language processing. It is also well-suited for edge applications that require real-time data processing.

IBM’s NorthPole chip showcases the concept of combining memory and compute on a single component, which has also been explored by AMD with its processor-in-memory (PIM) approach. The NorthPole chip’s success in achieving high performance with power efficiency makes it a promising technology for future AI applications.