Two billion-transistor beasts: POWER7 and Niagara 3
Written by Daniel   
Wednesday, 10 February 2010 18:52

From Ars Technica

A 300mm POWER7 processor wafer

In years past, an ISSCC presentation on a new processor would consist of detailed discussion of the chip's microarchitecture (pipeline, instruction fetch and decode, execution units, etc.), along with at least one shot of a floorplan that marked out the location of major functional blocks (the decoder, the floating-point unit, the load-store unit, etc.). This year's ISSCC is well into the many-core era, though, and with single-chip core counts ranging from six to 16, the only elements you're likely to see in a floorplan like the two below are cores, interfaces, and switches. Most of the discussion focuses on power-related arcana, but most folks are interested in the chips themselves.


In this short article, I'll walk you through the floorplans of two chips with similar transistor counts: Sun's Niagara 3 and IBM's POWER7. Most CPU geeks will already know much of the information I give below, but many readers will appreciate having it all together in one place.

Niagara 3: threads and I/O
Sun's Niagara 3

Sun's 1 billion-transistor, 16-core Niagara 3 processor is a great example of a modern multiprocessor-turned-SoC (system on a chip). Everything about this design is focused on pushing large numbers of parallel instruction streams and data streams through the processor socket at once. The shared cache is small, the shared pipes are wide, and the end result is a chip that's all about maintaining a high rate of flow, and not one that's aimed at collecting a large pile of data and chipping away at it with heavy equipment.

Each of the 16 individual SPARC cores that make up Niagara 3 supports up to eight simultaneous threads of execution, for a total of 128 threads per chip. Logically, the chip is laid out so that all of the cores communicate with a unified 6MB L2 cache via a crossbar switch that's placed in the middle of the chip. This combination of cores and L2 connected via a switch forms the basic compute architecture of the SoC.
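To make the "small shared cache" point concrete, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. The function names are made up for illustration, and the even per-thread split of the L2 is an assumption for the sake of the arithmetic, not a description of how the hardware actually allocates cache.

```python
# Figures quoted in the article for Niagara 3.
CORES = 16
THREADS_PER_CORE = 8
L2_CACHE_MB = 6  # unified L2 shared by all cores


def total_threads(cores: int, threads_per_core: int) -> int:
    """Hardware threads per chip: cores times threads per core."""
    return cores * threads_per_core


def l2_share_kb(l2_mb: float, threads: int) -> float:
    """L2 capacity per thread in KB.

    Assumption: 1 MB = 1024 KB and a perfectly even split across threads,
    which real workloads will not follow.
    """
    return l2_mb * 1024 / threads


threads = total_threads(CORES, THREADS_PER_CORE)
print(threads)                             # 128
print(l2_share_kb(L2_CACHE_MB, threads))   # 48.0
```

With all 128 threads busy, that works out to roughly 48KB of L2 per thread, which is why the design leans on wide pipes rather than a big pile of on-chip data.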

So that the chip can talk to the outside world, the L2 caches are connected to a variety of I/O interfaces: memory, PCIe, 1G/10G Ethernet, and coherency links. All told, those links can push a total of 2.4Tb/s worth of data through a single Niagara 3 socket—that's a lot of bandwidth, but you need it to feed that many threads. Let's take a quick look at each of these I/O links in turn.
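For a sense of scale, the 2.4Tb/s figure can be divided across the chip's 128 threads. The sketch below assumes that "Tb" means terabits with decimal prefixes (1Tb = 1,000Gb) and that the bandwidth is shared evenly; both are assumptions made for the arithmetic, and the helper names are hypothetical.

```python
# Aggregate I/O bandwidth quoted in the article, expressed in Gb/s.
# Assumption: 2.4Tb/s means 2.4 terabits per second, decimal prefixes.
AGGREGATE_GBPS = 2400
THREADS = 128  # 16 cores x 8 threads, per the article


def per_thread_gbps(aggregate_gbps: int, threads: int) -> float:
    """Bandwidth per hardware thread in Gb/s, assuming an even split."""
    return aggregate_gbps / threads


def to_gigabytes_per_s(aggregate_gbps: int) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return aggregate_gbps / 8


print(per_thread_gbps(AGGREGATE_GBPS, THREADS))  # 18.75
print(to_gigabytes_per_s(AGGREGATE_GBPS))        # 300.0
```

Under those assumptions, each thread has a nominal 18.75Gb/s of I/O behind it, or about 300GB/s for the socket as a whole.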




