PA Semi Divulges Its Power Processor Aspirations
by Timothy Prickett Morgan
If you are going to talk up the idea of the Power processor as an open standard and as the only viable, long-term alternative to the X64 architecture being created together (through competition but certainly not cooperation) by Intel and Advanced Micro Devices, you have to put your mouth where your money is, and vice versa. And to its credit, IBM has done that by licensing the Power architecture to startup chip maker PA Semi, which will soon be competing with Big Blue in the Power chip business.
When I was contacted by the public relations firm that is managing the PA Semi launch this week, I immediately thought of Hewlett-Packard's former Precision Architecture RISC processors. Since I am always looking for wicked irony, I fervently wished that PA Semi would turn out to be a company that licensed the PA-RISC architecture and created new chips based on it. But, alas, it was not meant to be, and PA Semi is focused instead on Power Architecture.
IBM launched its Power.org community to promote the ecosystem of the PowerPC and Power processors in December 2004, and back then the move was seen as a largely self-serving one. The other original PowerPC development partner, Motorola, was spinning off its chip unit, and even though its Freescale unit was still making and developing Power chips, it did not join. And Apple, the biggest PowerPC chip consumer from the original PowerPC Consortium that was formed in 1991, was, in retrospect, planning on being traitorous to the Power chips as well. A few months after Power.org was launched without Apple, there were rumors of difficulties between IBM and Apple over Power, and by June 2005 we all knew that Apple had actually decided to jump to Intel X64 processors. That left IBM, a bunch of game console partners consuming its Power-derived chips, a lot of makers of embedded devices, the standard Linux and real-time operating system makers, a few chip manufacturing facilities, and some very niche application software providers that like the Power architecture, as the forces behind the Power.org community. The advent of PA Semi shows that IBM is dead serious about the Power architecture succeeding in the market--even if it means someone else might be doing the succeeding some of the time.
PA Semi was established in 2003 in Santa Clara, Calif., by some of the biggest names in processor design. Dan Dobberpuhl, who was the lead chip designer on Digital Equipment's Alpha line of RISC processors--the first 64-bit RISC chips in the world and arguably still one of the most elegant designs the world has ever seen--is PA Semi's president and CEO. He was also a designer on the StrongARM line of power-efficient processors from DEC, and with that skill went on to found a company called SiByte with Jim Keller, also a DEC Alpha chip designer and the co-author of the "K8" or Opteron 64-bit architecture from AMD. SiByte, which made communications processors, was eventually snapped up by chip maker Broadcom, which left Dobberpuhl and Keller sitting around twiddling their thumbs. But not for long. They formed PA Semi, with Keller as vice president of engineering, and tapped Pete Bannon, who worked on the Alpha EV5 and EV7 chips and then moved over to Intel to help with the future "Tukwila" four-core Itaniums that are due in 2008 or so, to come on board as vice president of architecture.
So why did a bunch of guys who were hooked into Alpha, Itanium, and Opteron decide to clone the Power chips? The answer is simple: money and the lack of intense competition. According to estimates made by chip market watcher Linley Group and PA Semi, the entire Power and PowerPC chip market was worth about $1.4 billion in 2004 (including sales of all Power chips for all manner of devices), and it is expected to grow to $3.8 billion (even without Apple) by 2007. They reckon that the sale of embedded processors is worth about $5 billion in chip sales a year today, and high-performance computing accounts for another $2 billion in chips. With the right stuff, Power chips could take a run at more of that money. After all, if the Alpha chips were elegant, so too have been IBM's Star line of PowerPC chips, its Power4s, and its Power5s.
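To put that forecast in perspective, here is a back-of-the-envelope sketch in Python of the annual growth rate those two endpoints imply. It assumes smooth compound growth over the three years from 2004 to 2007, which is my simplification and not something Linley Group or PA Semi stated, and the function name is purely illustrative.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Endpoints quoted in the article, in billions of dollars.
power_market_2004 = 1.4
power_market_2007 = 3.8

# Assumption: growth compounds evenly across the three years.
rate = implied_annual_growth(power_market_2004, power_market_2007, 2007 - 2004)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the Power market would have to expand by a bit under 40 percent a year, which goes some way toward explaining why a startup would find it attractive.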
Dobberpuhl and his colleagues looked at all the still-vibrant architectures--X64, Sparc, Power, and Itanium. "Our thing is going to be high performance and lower power," explains Dobberpuhl. "There are only three suppliers of Power processors--IBM, Motorola, and AMCC, and competing directly against Intel doesn't make very good sense. Because of the heft of IBM, we believe that PowerPC is a keeper." So the PA Semi founders put their heads together, raised some $36 million from Bessemer Venture Partners, Silicon Valley Bank, and Venrock Associates, licensed the PowerPC spec version 2.04 (which has more features than the Power5 chip, apparently), and relatively quickly and silently amassed 150 employees who are now working on the company's first chips, called the PWRficient chips.
The PWRficient chips will scale from one to eight PowerPC cores, and while they will implement the full PowerPC instruction set, they will not implement the PowerPC AS instructions that IBM puts in its own Star, Power4, and Power5 chips to support the single-level storage of the iSeries-OS/400 architecture. So these chips will be able to run Linux and AIX, but not OS/400. The PWRficients will have from one to four integrated DDR2 main memory controllers on chip, as well as 64 KB data and instruction caches on each core and on-chip, shared L2 caches that are configured as arrays of up to 8 MB. Unlike IBM's Power designs, the PWRficient chips will implement the 128-bit VMX vector processor created by Motorola for its G series of 64-bit processors as well as a single floating point unit and a single integer unit. Dobberpuhl says that while the PowerPC architecture implements two floating point units, the PWRficient chips will have only one FP unit but will offer twice the performance. That's a pretty heavy claim. Each Power core in the PWRficient chip can execute four instructions per cycle. The chip includes an intelligent I/O bridge called Envoi and an interconnection fabric on the chip called Conexium that links the cores, L2 cache, main memory, and integrated I/O bridges together. The electronics that support IBM's Virtualization Engine hypervisor are also in the PowerPC spec, and they are consequently implemented in the PWRficient silicon. The chip also includes accelerators for RAID and iSCSI protocols as well as for PCI Express I/O and TCP/IP-Ethernet workloads. Like other PowerPC processors, it supports both current 64-bit and legacy 32-bit mode operations. By the way, the Conexium crossbar can be extended to support up to eight cores, four 2 MB L2 caches, and four DDR2 main memory controllers.
You might be wondering how all of this stuff is getting crammed onto a chip. Because the chips are made in a 65 nanometer process, you can get a lot of components on the chip. PA Semi has chosen a fab to make its chips, but is not prepared to announce who it is yet, by the way.
The company is working on two chips right now. The PA6T-1361E has a single Power core running at 1.5 GHz, the 64 KB data and instruction caches, 1 MB of L2 cache, and the integrated DDR2 main memory controller. The I/O subsystem implemented on the chip can support four PCI Express engines and four Gigabit Ethernet engines, which deal with the processing associated with these peripherals. The PA6T-1682M has two Power cores running at 2 GHz, two DDR2 memory controllers, 2 MB of shared L2 cache, plus the Conexium interconnect and the Envoi bus architecture. This one also has hardware support for 10 Gigabit Ethernet on the chip, which is pretty aggressive and which allows the glueless creation of 10 GigE clusters (think about that for a second, let it sink in). This chip will have about the same performance as the dual-core PowerPC 970, says Dobberpuhl. But it will burn a lot less electricity and emit a lot less heat.
How much less? A single-core PWRficient chip running at 1 GHz only burns 2 watts of juice, and running at 2 GHz it only burns 13 watts. Cranking it up to the top speed of 2.5 GHz will only push the energy consumed and emitted up to 22 watts. Think about that, too. Mull it over. Most X64 chips are in the 85 to 150 watts range. And to bring it on home for Apple, check this out. Intel's future "Sossaman" dual-core chip will run at around 2.5 GHz, have a memory latency of 120 nanoseconds, emit about 73 watts, and occupy about 2245 square millimeters of space. The PWRficient 1682M will also be a dual-core chip, but it will run at 2 GHz, have a memory latency of 55 nanoseconds, run at 25 watts, and occupy only 1225 square millimeters. The Sossaman chip will have about twice the integer performance and about half the floating point performance of the PWRficient 1682M. But the cost in watts is pretty high.
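If you want to mull those numbers over with a calculator, here is a small Python sketch that works out watts per gigahertz for the single-core chip and the relative performance per watt of the two dual-core parts. Every input is a figure quoted above; the "twice the integer" and "half the floating point" ratios are the article's rough characterizations rather than benchmark results, and the function and variable names are mine.

```python
# Single-core PWRficient power draw as quoted: clock in GHz -> watts.
SINGLE_CORE_WATTS = {1.0: 2.0, 2.0: 13.0, 2.5: 22.0}

# Dual-core power figures as quoted, in watts.
SOSSAMAN_WATTS = 73.0
PWRFICIENT_1682M_WATTS = 25.0

def watts_per_ghz(points):
    """Power cost of each gigahertz at every quoted clock speed."""
    return {ghz: watts / ghz for ghz, watts in points.items()}

def relative_perf_per_watt(perf_ratio, watts_a, watts_b):
    """Perf/watt of chip A relative to chip B, given perf_A / perf_B."""
    return perf_ratio / (watts_a / watts_b)

per_ghz = watts_per_ghz(SINGLE_CORE_WATTS)

# Rough ratios from the article: Sossaman has about 2x the integer
# and about 0.5x the floating point performance of the 1682M.
int_per_watt = relative_perf_per_watt(2.0, SOSSAMAN_WATTS, PWRFICIENT_1682M_WATTS)
fp_per_watt = relative_perf_per_watt(0.5, SOSSAMAN_WATTS, PWRFICIENT_1682M_WATTS)

for ghz, cost in sorted(per_ghz.items()):
    print(f"{ghz} GHz: {cost:.1f} watts per GHz")
print(f"Sossaman integer perf/watt vs 1682M: {int_per_watt:.2f}x")
print(f"Sossaman floating point perf/watt vs 1682M: {fp_per_watt:.2f}x")
```

Taken at face value, the figures say two things: each extra gigahertz on the PWRficient core gets more expensive in watts as the clock climbs, and even with double the integer throughput, Sossaman comes out behind the 1682M on integer work per watt and well behind on floating point.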
While these PWRficient chips are pretty cool, it is going to take some time to bring them to market. The dual-core PWRficient chip is aimed at the HPC server market and will sample in the third quarter of 2006 and start shipping in early 2007, while the single-core chip aimed at the high-volume embedded market will start sampling in early 2007 and presumably ship a little later. The four-core PWRficient chip is due at the end of 2007, while the eight-core chip is due at the beginning of 2008. PA Semi already has a test chip, code-named "Virgo," running in the labs and implemented in the 65 nanometer process.