The X Factor: One Socket to Rule Them All
February 5, 2007 Timothy Prickett Morgan
Vendors in the information technology business talk about standards more than just about anything else, something that the advent of the open systems business and Unix taught them to do two decades ago. But from their point of view, standards are a terrible thing–unless you happen to own one. We have standards for operating system features–POSIX and TCP/IP being two big ones–standards for networking–Ethernet of ever-increasing speed–standards for I/O and storage–PCI, PCI-X, and now PCI Express peripheral slots, plus SCSI, SATA, and SAS disks–and standards for main memory–DDR2 and FB-DIMM modules. And now, perhaps, is the time for a standard CPU socket and interconnection scheme.
This probably sounds crazy, at least at first. Throughout the history of the big system, then the PC, and then the server, a processor has been more or less defined by the proprietary socket that it uses to link the CPU to its cache memories (at least until those were moved on chip), to its main memory, to its I/O channels, and, in multiprocessor systems, to the other processors in the complex. Developing a CPU meant developing a unique socket for it, which gave chip makers a large degree of control. And from control comes profits.
Companies in the IT business have two ways to get profits. One is to invent unique hardware or software technology, based on substantial research and development efforts and providing incremental advantages over other–and thank heaven mostly incompatible–technologies. The other way to get profits is to establish a proprietary technology as the de facto standard in the industry. The former is how new technologies are introduced into a market, and the latter is what keeps new technologies from entering a market, thereby allowing established, volume players to take over and establish what are, for all intents and purposes, monopolies.
Examples of the former include the creation of the minicomputer by Digital Equipment in the mid-1960s, the introduction of CISC and then RISC processors by Sun Microsystems and Hewlett-Packard to create the commercial Unix market in the 1980s, and the advent of respectable server processors from Intel in the early 1990s. Examples of the latter include the gradual takeover by the Intel-Microsoft duopoly, first on the desktop with the X86-Windows combo in the late 1980s and early 1990s, and then in the adjacent battle for data center market share that Intel and Microsoft waged in the late 1990s and early 2000s, which put Xeon processors and Windows Server on the vast majority of machines and which now represents the majority of server revenues each quarter. You get market share by being good, and you keep and extend market share by using volume economics against anyone who might take you on. These forces at work in the IT world are very pure examples of survival of the fittest and might makes right.
However, the best technologies are often, through no fault of their own, not the fittest, and might does not always make right. Witness the advent of Linux and the Opteron processors from Advanced Micro Devices, each of which is taking on one half of the Wintel duopoly, and each of which has, in the past five years, grabbed roughly a quarter of its respective half of this hybrid market. This is remarkable, given the power and money that both Microsoft and Intel hold in the server markets.
A standard CPU socket would, of course, lower the profit potential inside systems. There is no question about that, just as industry standards in memory, peripherals, and software have similarly lowered profits. Way back when, in the dawn of the computer age and during the heydays of the mainframe and the minicomputer, a vendor made the entire hardware and software stack–which was very expensive and which was also in a relatively low volume compared to today’s server market. But, little by little, these components were standardized. The CPU socket is merely the last thing to go. And eventually, it will.
Or, more precisely, the CPU socket will become standardized unless vendors try to extend the life of the proprietary chip socket by integrating more and more components of the motherboard onto the server chip, effectively making the socket a whole computer. System on a chip efforts are not just about bringing more components onto the chip; they are about keeping things proprietary enough to give vendors control as well as to give customers more integrated, faster computers.
Nonetheless, the CPU socket standard might just be coalescing from the cosmic ether, ever so quietly. Intel, for instance, had hoped to get its Common System Interface out the door this year, but the effort has been pushed out further into the future–perhaps indefinitely, perhaps not. The CSI effort, which was announced several years ago, aimed to get the Xeon and Itanium processors into the same CPU socket, using the same interconnection scheme, thereby leveling the field between these two very different CPU architectures. CSI was delayed, most likely because Advanced Micro Devices' Opteron took off and Intel had to scrap all of the work on CSI it had done with its Xeon processors so it could recreate the Xeons using the cores at the heart of its mobile processors–what we now know as the Core microarchitecture. CSI is expected to make its debut in 2008 with the quad-core "Tukwila" Itanium processor, a chip that will have tweaked "Montecito" cores, a shared on-chip L3 cache, dedicated L1 and L2 caches in each core, an on-chip FB-DIMM memory controller, and high-speed point-to-point links between processors that will allow the glueless creation of NUMA-style single system images, most likely spanning up to 32 cores and possibly as many as 128.
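The arithmetic behind those glueless core-count ceilings is worth a moment. In a point-to-point fabric, every socket either gets a direct link to every other socket–which gets expensive fast–or requests to remote memory take multiple hops. The toy sketch below (illustrative only; it is not Intel's actual CSI topology or AMD's HyperTransport routing, and the function names are made up for this example) shows why fully connected fabrics stop at small socket counts:

```python
# Illustrative sketch only -- not Intel's actual CSI or AMD's HyperTransport
# topology. It compares the worst-case hop count for a memory request across
# a few point-to-point fabric shapes, which is the kind of trade-off behind
# glueless NUMA ceilings like 32 or 128 cores.

def fully_connected_links(n):
    """Links needed to give every socket a direct, one-hop path to every other."""
    return n * (n - 1) // 2

def ring_diameter(n):
    """Worst-case hops between two sockets on a bidirectional ring of n sockets."""
    return n // 2

def mesh_diameter(dims):
    """Worst-case hops on an open (non-wraparound) mesh of the given dimensions."""
    return sum(d - 1 for d in dims)

if __name__ == "__main__":
    # Four sockets can be fully connected cheaply: 6 links, one hop worst case.
    print(fully_connected_links(4))   # 6
    # At 32 sockets, full connectivity would take 496 links, so larger
    # fabrics fall back to rings or meshes and pay extra hops to remote memory.
    print(fully_connected_links(32))  # 496
    print(ring_diameter(32))          # 16
    print(mesh_diameter((4, 4, 2)))   # 7
```

The point of the sketch is simply that link count grows quadratically while hop count grows with fabric diameter, so every glueless NUMA design picks a ceiling and a topology to match.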
It doesn’t take a genius to figure out that Intel’s CSI looks very much like AMD’s HyperTransport interconnect, but then again, HyperTransport looks an awful lot like the NUMA clustering at the heart of the former Digital Equipment’s high-end AlphaServer line from the late 1990s. That is no accident: AMD saw that Digital’s approach provided massive scalability and performance, and Intel, for its part, bought the intellectual property embodied in the Alpha machines and hired many of Digital’s engineers, who have been creating CSI and the Tukwila chip.
AMD, of course, has its eye on a market broader than the X64 market with its “Torrenza” open socket effort, which was announced last year and which could turn out to be the nucleus around which a CPU socket standard forms. Torrenza was created to foster innovation in the Opteron processor socket, and AMD wants companies to create co-processors, math units, and other kinds of electronics that plug right into the Opteron socket and augment the processing of the CPU within a single system.
Even before Torrenza was launched, supercomputer maker Cray, which hand-crafted the “Red Storm” Linux-Opteron supercomputer for the United States government’s Department of Energy, figured out on its own that if it recast its MTA-2 massively threaded processor (which is suspected of having the National Security Agency as its main customer) to fit into the Opteron socket, it could drop it into the Red Storm design and create a very powerful machine capable of sifting through massive amounts of information. That machine, the XMT using the new “ThreadStorm” MTA-3 processor, was launched late last year, and over time, Cray is expected to converge its supercomputers to use Opteron sockets across its four processor types–AMD’s Opterons, field programmable gate arrays, MTA-2s, and its own multistreaming vector processors, or MSPs. The Red Storm design takes the HyperTransport interconnect and creates a 3D mesh interconnect, allowing massive clustering of memory and processors.
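To make the 3D mesh idea concrete: each node in such a design is addressed by (x, y, z) coordinates and links only to its nearest neighbor along each axis. The sketch below is illustrative only–it is not Cray’s actual SeaStar routing logic, and the function name is invented for this example:

```python
# Illustrative sketch of 3D mesh addressing, not Cray's actual routing.
# Each node is named by (x, y, z) coordinates and links only to its nearest
# neighbor along each axis, so an interior node has six links while a
# corner node has only three.

def mesh_neighbors(coord, dims):
    """Return the directly linked neighbors of a node in an open 3D mesh."""
    out = []
    for axis in range(3):
        for step in (-1, 1):
            c = list(coord)
            c[axis] += step
            if 0 <= c[axis] < dims[axis]:   # stay inside the mesh edges
                out.append(tuple(c))
    return out

if __name__ == "__main__":
    dims = (4, 4, 4)                              # a 64-node mesh
    print(len(mesh_neighbors((0, 0, 0), dims)))   # 3: a corner node
    print(len(mesh_neighbors((1, 2, 1), dims)))   # 6: an interior node
```

The appeal of this shape for a machine like Red Storm is that every link is short and point-to-point, and the fabric grows by simply extending the mesh along an axis.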
Similarly, there are rumors that IBM will wise up and put the future Power7 processors into Opteron sockets, and Sun, which has done a lot of engineering to bring its “Galaxy” Opteron-based servers to market, has hopefully thought far enough ahead that future “Niagara” and “Rock” Sparc processors will plug into future Opteron sockets and use the HyperTransport interconnect. Both vendors have licensed HyperTransport technology.
Getting Intel and AMD to agree to converge the CSI socket and interconnect with the Opteron socket and HyperTransport interconnect would be problematic. But stranger things have happened because of customer pressure and the desire of IT vendors to wring costs out of their machines.
In the end, customers would probably want hybrid servers that support Power, Sparc, X64, mainframe, or any other kind of CPU, co-processor, FPGA, or adjunct electronics–basically because it would allow them to use the same memory, disks, and peripherals, regardless of processor architecture (so long as the operating system had the right drivers, of course). This is a better way to build a server, and it could open up a whole lot of interesting application mixing if sockets could be hard partitioned based on their processor architecture and clustered across common CPU architectures within the same system. This would be the ultimate in protecting customer investments–something that all server makers give lip service to. It is time for customers to make them put their money where their lip service is.