IBM Raises The Curtain A Little On Future Power Processors
November 13, 2024 Timothy Prickett Morgan
We commented back in August, when the Hot Chips 2024 conference was underway at Stanford University, that it was odd that IBM did not reveal some of the speeds and feeds of the Power11 processor that we all know is coming in 2025. And when the TechXchange 2024 partner event was held in Las Vegas a few weeks ago, we were pretty sure that IBM was working on some sort of accelerated system for GenAI applications, and a week later we remembered that the obvious pairing was a Power System host with scads of IBM’s own “Spyre” AI accelerators running WatsonX models or other open source ones like the Llama family from Meta Platforms.
This, we cleverly thought, was what Big Blue talked to partners about in secrecy at TechXchange, and we figured any revelations about the Power11 processor would come in the new year.
Nope. It’s happening today, and we are getting a lot of the information that IBM disclosed under embargo to partners and customers.
And the reason is simple: Power11 may not be what many people are expecting. It is not going to use a chiplet architecture but rather lay the foundation for one that will be used in the Power Next processor, presumably to be called the Power12. But having said that, the Power11 that is going to be delivered at the beginning of the third quarter of 2025 is precisely where IBM would have been by that time, despite the difficulties and delays that were mostly due to GlobalFoundries failing to deliver on its 10 nanometer and 7 nanometer manufacturing processes, which forced IBM to partner with Samsung to deliver a very different Power10 than was originally planned. And, as it turns out, that Power10 was better suited to IBM’s actual AIX and IBM i customers, and it did not reflect an IBM trying to chase the X86 and Arm server CPU makers in their mad dash to add an increasingly large number of cores to a server CPU socket for the HPC and then AI markets.
We went over all of the roadmap changes in detail back in June 2021 when IBM sued GlobalFoundries for not making good on its promises. This was after IBM had amassed a sufficient supply of the Power9 processors etched with the GlobalFoundries 14 nanometer processes, which Big Blue helped GlobalFoundries perfect and bring to market. A roadmap refresher is in order so you have a baseline against which to measure Power11 and Power12:
This was the plan, we think, when IBM and GlobalFoundries decided to skip 10 nanometers and jump to 7 nanometers. In that plan, Power10 was essentially two Power9 chips with microarchitecture changes and a process shrink. That all fell apart, and IBM walked away from the traditional supercomputer business and from support for NVLink memory coherence to Nvidia GPUs. (We think Nvidia had its own Arm server chip aspirations and probably wanted a fortune to license NVLink, but that is a guess.) IBM then decided to use the time to create a Power10 chip that was much better suited to its enterprise customers and their significant but not insane AI processing needs.
And then here is what we got:
So now, instead of a pair of Power10 chiplets with 24 cores per socket etched with GlobalFoundries 7 nanometer processes that was supposed to come out in early 2020, we got a Power10 chip implemented in Samsung 7 nanometer processes with 16 cores per die (but only 15 of them exposed) that came out in the fall of 2021 for high-end machines and in July 2022 for entry and midrange machines. The truth is, the coronavirus pandemic that was declared in March 2020 would have totally screwed up the original Power10 rollout using GlobalFoundries chips. So, yes, there was economic damage when GlobalFoundries pulled the plug on 10 nanometer and then 7 nanometer processes, leaving IBM in the lurch. But economic damage was coming anyway, and IBM put the delay caused by the move to Samsung 7 nanometer processes to good use and essentially created a Power10 chip that pulled forward a lot of ideas from the Power11 plan.
This includes a completely revamped Power instruction set architecture and a brand new core design that includes not just vector units but also matrix math units for accelerating AI and HPC workloads. Thus far, while AMD and the Arm collective have vector units in their core designs, only Intel has a matrix unit in its Xeon 5 and Xeon 6 cores. In fact, IBM beat the entire market to this advancement. And, it brought strong eight-threaded cores to market that suited the needs of its enterprise customers in terms of single-threaded performance, the number of threads per core, the number of threads per socket, and the number of sockets per system. Power10 is second to none on these fronts, which is one of the reasons why Power10 systems are increasingly preferred for SAP HANA in-memory databases and, indeed, any relational databases underpinning ERP, CRM, and SCM back office systems. AMD runs out at two sockets, Intel really at four and sometimes eight depending on the OEM making the servers. Yes, they have more cores, but not more shared memory, not more clock speed, not more memory and I/O bandwidth, and not more performance per socket for certain kinds of workloads. Yes, they are cheaper, but what IBM is doing is hard and not really a volume play, so it is necessarily more expensive.
And so now, here is the roadmap for Power11 and what we presume is Power12:
“So fundamentally, we are starting from a really good place,” Bill Starke, distinguished engineer at Power Systems and the chief architect of the Power10 and Power11 processors, tells The Four Hundred. “So you are not seeing us completely change everything up this time. The Power11 architecture is an extension built off of the Power10 architecture. Getting to our DDR5 version of the OMI memory, we’re getting a massive uplift, and I’m even more excited about OMI than I was when we came out with the original OMI with Power10. So that’s a key element that goes hand in hand, optimized with the Power11 system architecture. The Spyre accelerator and the evolution of the AI story is also important. Spyre will be a piece of the Power11 portfolio and beyond.”
Power11 is not going to be etched in 5 nanometer processes, but rather be based on a refined 7 nanometer process from Samsung that allows IBM to crank the clocks and get better thermals.
We are going to get into the memory architecture and AI extensions for Power11 separately and give them the justice they are due. Fear not. This story is just about the roadmap.
Which brings us to Power Future or Power Next or Power12 if you want to be bold about it. We have been talking to Starke about chiplet architectures for years, and IBM has been shipping chiplet implementations of the Power line since way back in 2005 with the Power5+ processors. So this is old hat for Big Blue.
More recently, there have been dual-chip module (DCM) implementations of Power9 and Power10 processors, but these are just putting two whole chips into a single socket, not breaking the socket into independent compute, memory, and I/O elements and then using them in different ratios to meet different needs.
That, we think, is coming with Power12, and frankly we expected it with Power11. But we think that IBM wants to work with Samsung on 2.5D interposer technology and the chiplet design along with what we think is a process shrink to 3 nanometers rather than the 5 nanometers we all expected for 2025. Starke did not confirm any of this, of course, but did say that the move from 7 nanometers down to 5 nanometers did not provide enough benefit for IBM to do it, not when it could work on other things to boost real-world performance for IBM i and AIX shops.
Exactly how this will all play out has not been divulged, but Starke gave us some hints and we will be going through these in follow-up stories. Stay tuned.
Thanks for the quality of this kind of coverage…
Thanks for the good information!
However, are you sure that in terms of “single-threaded performance” the Power10 has a chance against its x86 counterparts? From what we see in our environment, and also based on official rPerf numbers, ST performance is much weaker on Power compared to competitors.
This sometimes causes headaches, and I would be very happy if this could be successfully addressed with Power11.
Interesting. I think it is generally a stronger thread and with a much higher clock speed. There’s lots of cache and memory bandwidth. Is there something weird about the code?