Volume 18, Number 32 -- September 14, 2009

The Feeds and Guessed Speeds of Power7

Published: September 14, 2009

by Timothy Prickett Morgan

Just before The Four Hundred went on hiatus, I told you that IBM would be making a presentation about the forthcoming Power7 processors at the Hot Chips 21 conference hosted by the IEEE at Stanford University. And indeed, the top techies working on the eight-core Power7 processors did raise the curtains a little bit about the future brains of the Power Systems lineup, which are due in 2010.

IBM confirmed to me a little more than a year ago that the Power7 chips would have at least eight cores and would be made using a 45 nanometer process in Big Blue's East Fishkill, New York, wafer baker. And in August, as IBM was preparing for the Hot Chips event and knew the analysts prebriefed about Power7 would start talking to the press whether or not it wanted them to, the company said that Power7 would come with four, six, or eight cores with as many as four instruction threads per core. The company also said that it would support up to 1,000 logical partitions per server on Power7-based machines, and that current Power 570 and Power 595 machines would be able to upgrade to the Power7 equivalents while preserving serial numbers. This, as I explained in detail in August, makes the bean counters happy because existing infrastructure does not have to be written off, and it makes IBM happy because now it can sell current top-end Power6 or Power6+ machines to customers who know they can upgrade to Power7 later if they need to.

So, what do we now know about the Power7 chip that is new? Plenty. First, according to the presentation given by Ron Kalla, chief engineer of the Power7 chip, and Balaram Sinharoy, chief core architect for the Power7, the chip will be made using a 45 nanometer lithography process that incorporates copper and silicon-on-insulator transistor doping techniques. The chip, according to Kalla, will have 1.2 billion transistors and will offer the "equivalent function" of a chip with 2.7 billion transistors. (I am not sure what this means, to be honest.) The chip weighs in at 567 square millimeters in area, and is just a little bit wider than it is deep.

The most interesting thing about the Power7 chip is that it includes 32 MB of embedded DRAM as a shared L3 cache for the processor cores. This is the same size cache that IBM had been offering in multi-chip packaging with Power5 and Power6 generations of machines, but now it is not only on the chip, but right smack dab in the middle of it, acting as a huge information exchange between the eight cores on the chip. By my estimates, looking at the picture of the fully loaded Power7 chip, the cores account for about 53 percent of the surface area of the chip, the local SMP links, remote SMP links and I/O, and two four-channel DDR3 main memory controllers take up about 35 percent of the space, leaving the L3 cache and chip interconnect to eat up about 12 percent of the real estate on the chip.
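To put those percentages into square millimeters, here is a minimal Python sketch. The die area is the figure cited above, and the shares are my eyeball estimates from the die photo, not IBM numbers; the region names in the dictionary are labels made up for this illustration.

```python
# Die area is the figure cited in the article; the shares are the author's
# eyeball estimates from the die photo, not numbers published by IBM.
DIE_AREA_MM2 = 567

AREA_SHARE = {
    "cores": 0.53,
    "smp_links_io_memory_controllers": 0.35,
    "l3_cache_and_interconnect": 0.12,
}


def area_breakdown(die_area_mm2, shares):
    """Return the approximate area in mm^2 for each region of the die."""
    return {region: die_area_mm2 * share for region, share in shares.items()}


breakdown = area_breakdown(DIE_AREA_MM2, AREA_SHARE)
for region, mm2 in breakdown.items():
    print(f"{region}: about {mm2:.0f} mm^2")

# Rough area per core, assuming the eight cores split their share evenly.
area_per_core = breakdown["cores"] / 8
print(f"per core: about {area_per_core:.1f} mm^2")
```

Because the shares are estimates, the results are only good to within a few square millimeters.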

The Power7 core has a dozen execution units, which are binary compatible with previous Power6 and Power6+ chips, and a revamped instruction pipeline that is necessary to cope with the many threads and many cores in the chip. The Power6 chips had two cores, each with two threads. The Power7 chip will have, at 32 threads, eight times the virtual instruction streams of the Power6 per socket. So things have to change to accommodate that. The Power7 core has two fixed point units, two load store units, four double-precision floating point units, one decimal floating point unit (for doing the kind of money math popular in business applications), and one vector unit (for doing the matrix math popular in nuclear simulations and weather forecasts); a branch unit and a condition register unit round out the dozen. The cores support out-of-order instruction execution, as prior generations of Power chips have, and have 32 KB of L1 instruction cache, 32 KB of L1 data cache, and 256 KB of L2 cache.
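The thread arithmetic is simple enough to check in a few lines of Python. This is only a sketch using the core and thread counts cited above; the function name is made up for the illustration.

```python
def threads_per_socket(cores, threads_per_core):
    """Hardware threads (virtual instruction streams) in one chip socket."""
    return cores * threads_per_core


# Power6: two cores with two threads each.
power6_threads = threads_per_socket(cores=2, threads_per_core=2)

# Power7: eight cores with up to four threads each.
power7_threads = threads_per_socket(cores=8, threads_per_core=4)

ratio = power7_threads // power6_threads
print(power6_threads, power7_threads, ratio)

# The four-core and six-core variants with all four threads per core enabled.
for cores in (4, 6):
    print(cores, "cores:", threads_per_socket(cores, 4), "threads")
```

The four-core and six-core variants would top out at 16 and 24 threads per socket, respectively, with all four threads per core enabled.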

As for the L3 cache on chip, which could account for a lot more of the transistors than the eDRAM area would imply, each core has an L-shaped segment of that cache that is loosely affiliated with it and that weighs in at 4 MB. A core can also reach into the L3 cache segments affiliated with the other cores, which is a lot quicker than going out through the memory controller to the DDR3 main memory. The difference between searching a remote L3 cache segment and going to main memory is like the difference between a heartbeat and waiting for a cab in New York on a rainy day.
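Here is how the cache sizes add up, as a small Python sketch that uses only the capacities cited above. The constant and function names are made up for the illustration, and the tally assumes every core keeps its full 4 MB segment, which may not hold on chips with some cache disabled.

```python
KB = 1024
MB = 1024 * KB

# Per-core cache sizes as cited in the article.
L1_INSTRUCTION = 32 * KB
L1_DATA = 32 * KB
L2 = 256 * KB
L3_SEGMENT = 4 * MB  # the L3 segment loosely affiliated with each core


def private_cache_kb():
    """L1 plus L2 capacity belonging to a single core, in KB."""
    return (L1_INSTRUCTION + L1_DATA + L2) // KB


def shared_l3_mb(cores):
    """Total L3 in MB, assuming each core keeps its full 4 MB segment."""
    return cores * L3_SEGMENT // MB


print(private_cache_kb(), "KB of L1 and L2 per core")
print(shared_l3_mb(8), "MB of shared L3 on an eight-core chip")
```

Eight segments of 4 MB give the 32 MB of shared L3 cache that IBM cites for the fully loaded chip.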

As with prior Power designs, IBM is really concerned with pumping up the memory and SMP bandwidth. IBM says that it can get 100 GB/sec of sustained memory bandwidth per chip socket, and that the SMP interconnect of a 32-socket machine has an aggregate bandwidth of 360 GB/sec.
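For a rough sense of scale, the Python sketch below spreads those bandwidth figures evenly across cores, memory channels, and sockets. The even split is my simplifying assumption, not something IBM stated, so treat the results as averages rather than measured rates.

```python
# Figures cited by IBM in the presentation.
MEMORY_BW_GB_PER_SEC = 100   # sustained memory bandwidth per chip socket
SMP_BW_GB_PER_SEC = 360      # aggregate SMP bandwidth of a 32-socket machine
SOCKETS = 32

# From the chip description: eight cores, two four-channel DDR3 controllers.
CORES = 8
MEMORY_CHANNELS = 2 * 4


def even_share(total, parts):
    """Average share when a total is split evenly (an assumption)."""
    return total / parts


memory_bw_per_core = even_share(MEMORY_BW_GB_PER_SEC, CORES)
memory_bw_per_channel = even_share(MEMORY_BW_GB_PER_SEC, MEMORY_CHANNELS)
smp_bw_per_socket = even_share(SMP_BW_GB_PER_SEC, SOCKETS)

print(f"memory bandwidth per core:    {memory_bw_per_core} GB/sec")
print(f"memory bandwidth per channel: {memory_bw_per_channel} GB/sec")
print(f"SMP bandwidth per socket:     {smp_bw_per_socket} GB/sec")
```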

Here's where it gets interesting. Rather than make one Power7 chip that is supposed to fit all jobs, IBM will be tweaking the chips that come out of the fab. Some of them will have the full eight DDR3 memory channels activated, while others will only sport four. Some will have half-width SMP buses, while others will sport the standard widths and others still will have four-wide SMP buses. The precise way these are mixed and matched with the core count is not clear, but IBM did say that for blade and rack servers with two or four processor sockets, the Power7 chips would have only one memory controller activated and three 4-byte local SMP links, all in a single chip organic package. That might imply a maximum of four cores for these machines, if I understand the schematics correctly, but six cores is also possible. I would guess that these chips will be half-duds, with maybe 8 MB or 16 MB of the L3 cache working.

For midrange and high-end Power Systems boxes using the Power7 chip, IBM will activate both memory controllers plus three 8-byte local links and two 8-byte remote links. These chips will be packaged up in single-chip glass ceramic packages and will be used in the standard Power Systems product line. It is my guess that IBM will offer six or eight core variants here, and with 16 MB or 32 MB of L3 cache activated.

There is another variant of the Power7 chip that will not appear in standard machines and which seems to be destined for the "Blue Waters" massively parallel supercomputer being built by IBM for the National Center for Supercomputing Applications at the University of Illinois. This is a multichip module (MCM) that will be a standalone SMP node that has four entire Power7 chips, with all eight of their memory controllers activated, plunked into a single ceramic package. These chips will all sport three 16-byte local links that glue the chips together. And there is every reason to believe these chips, which were developed under the code-name "Q7," will end up running at higher clock speeds than the MCM packaging should allow.
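The three packaging options described above can be laid side by side in a short Python sketch. The class and field names are made up for the illustration, and a value of None means the presentation, as reported here, did not give that detail. Core counts and L3 sizes are left out because they are my guesses, not IBM statements.

```python
from dataclasses import dataclass
from typing import Optional

# Each memory controller drives four DDR3 channels, per the chip description.
CHANNELS_PER_CONTROLLER = 4


@dataclass(frozen=True)
class Power7Package:
    name: str
    chips: int
    controllers_per_chip: int
    local_links: int
    local_link_bytes: int
    remote_links: Optional[int] = None       # None: not stated in the article
    remote_link_bytes: Optional[int] = None  # None: not stated in the article

    @property
    def memory_channels(self) -> int:
        """DDR3 channels across every chip in the package."""
        return self.chips * self.controllers_per_chip * CHANNELS_PER_CONTROLLER

    @property
    def local_link_width_bytes(self) -> int:
        """Combined width of the local SMP links on one chip, in bytes."""
        return self.local_links * self.local_link_bytes


PACKAGES = [
    Power7Package("single-chip organic (blade and rack)", 1, 1, 3, 4),
    Power7Package("single-chip ceramic (midrange and high-end)", 1, 2, 3, 8, 2, 8),
    Power7Package("four-chip MCM", 4, 2, 3, 16),
]

for pkg in PACKAGES:
    print(f"{pkg.name}: {pkg.memory_channels} memory channels, "
          f"{pkg.local_link_width_bytes} bytes of local link width per chip")
```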

As part of the Power7 design, cores on the chip can be dynamically turned on or off (provided IBM has enabled them, of course), and the core frequencies of the chips can also be turned up and down in an effort to reallocate energy. The threads on the chips can also be enabled and disabled as needed, ranging from a low of one thread up to the full four threads. I wonder if IBM will allow machines with fewer threads turned on to run at slightly higher clock speeds?

From what I can see in the presentation, the individual per-core performance of the Power7 cores will go up by about 20 percent, even though I expect clock speeds in the range of 3 GHz to 4 GHz, much less than the 4.2 GHz to 5 GHz of the Power6 and Power6+ cores. Multithreading is the answer. And so is adding cores. Kalla said that IBM would be able to boost the per-socket performance of a Power Systems box by a factor of four, getting Big Blue back on the roadmap to where it needs to be after the Power6+ totally fizzled in terms of delivering the performance expected. I still think IBM expected to get clock speeds up to 6 GHz or higher with the Power6+ and opted instead to double up the cores to boost system oomph. But no one will ever cop to that.
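As a back-of-the-envelope check, the Python sketch below multiplies the per-core gain by the increase in core count. It assumes perfectly linear scaling across cores, which no real SMP system delivers, and the 20 percent per-core figure is my reading of the presentation rather than an IBM number.

```python
POWER6_CORES = 2       # cores per Power6 socket
POWER7_CORES = 8       # cores per fully loaded Power7 socket
PER_CORE_GAIN = 1.20   # the author's estimate: about 20 percent per core
IBM_STATED_FACTOR = 4  # per-socket boost cited by Kalla


def naive_socket_speedup(old_cores, new_cores, per_core_gain):
    """Per-socket speedup if performance scaled linearly with core count."""
    return (new_cores / old_cores) * per_core_gain


naive = naive_socket_speedup(POWER6_CORES, POWER7_CORES, PER_CORE_GAIN)
print(f"naive per-socket speedup: {naive:.1f}x")
print(f"IBM's stated factor:      {IBM_STATED_FACTOR}x")
```

The naive figure comes out above the factor of four that Kalla cited, which leaves some room for scaling overhead or for a conservative quote on IBM's part.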

In next week's issue, I will take a stab at what the future Power7 machines might look like in terms of specs and performance compared to the current Power Systems lineup.


RELATED STORIES

IBM to Reveal Power7 Secrets at Hot Chips

Power 7: Lots of Cores, Lots of Threads

IBM Touts Power Systems Prowess on SAP Tests

With No Power6 QCMs, IBM Waits for Power7

IBM Launches Power6+ Servers--Again

Come On Out, Power6+, You Win

Power vs. Nehalem: Time to Double Up and Double Down

Power vs. Nehalem: Scalability Is So 1995, Cash is So 2009

IBM and Resellers Do the iLoyalty Blitz

i Roadmaps: Here Be Dragons

IBM Doubles the Cores on Midrange Power Systems

Intel Keeps Both Arms Swinging with Xeons, Jabs with Itanium

More Power7 Details Emerge, Thanks to Blue Waters Super

Intel's Nehalems to Star at IDF, AMD Pitches Shanghai

Intel Talks Up X64, Itanium Roadmaps Ahead of IDF

Intel Announces First "Penryn" Xeon Processors

Bang for the Buck: Raising the System iQ





Editor: Timothy Prickett Morgan
Contributing Editors: Dan Burger, Joe Hertvik, Brian Kelly, Shannon O'Donnell,
Mary Lou Roberts, Victor Rozek, Kevin Vandever, Hesh Wiener, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed


 

Copyright © 1996-2009 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034
