Intel Keeps Both Arms Swinging with Xeons, Jabs with Itanium
Published: August 26, 2008
by Timothy Prickett Morgan
Since waking up after being asleep for enough years to let Advanced Micro Devices have the driver's seat in the X64 chip market, in terms of design elegance, the top brass at Intel have been espousing their so-called "tick-tock" rhythm of process and microarchitecture developments for Xeon and Itanium server processors. Maybe "left jab-right hook" would be a more accurate description, since Intel is in position to keep pounding on a substantially weakened AMD.
This was obvious from the presentations that Pat Gelsinger, senior vice president and general manager of the company's Digital Enterprise Group, made at Intel Developer Forum in San Francisco last week. While there was not a lot of new information disclosed about future processor roadmaps, Gelsinger was clear that the company is never going to let up on innovation and give AMD a chance to catch its breath again.
Gelsinger and the Intel team in attendance at IDF wanted to talk up the future "Nehalem" Core i7 and Atom desktop, laptop, and embedded processors, which are of course interesting. But at IT Jungle, what we care about is the iron inside the data center, and that means we care about Xeons and Itaniums in the Intel product line. Because Gelsinger didn't want to give the future Xeons all the glory, he talked briefly about the impending "Tukwila" quad-core Itanium processors and the Itanium roadmap before diving into the Xeons, which are the volume server chips for Intel. Tukwila, like the Nehalem Xeons, will make use of the new QuickPath Interconnect, an analog to AMD's HyperTransport interconnect that will replace the front side bus architecture that Xeons have used since Intel got into the server racket in 1992. Gelsinger reiterated that Intel would deliver Tukwila chips to its Itanium partners later this year, and said that the company expected systems using the chips to appear "in the first part of 2009." No, that sure is not the middle of 2008, which is when Tukwila was originally expected by many, and it is surely later than Intel had hoped to deliver "Tanglewood," which is what the Tukwila project was called before Intel renamed it five years ago.
Tukwila will be implemented in a 65 nanometer process, unlike the 90 nanometer process used for the current "Montecito" and "Montvale" dual-core Itanium 9000 series of chips. As we previously reported, Intel was showing off the Tukwila chip at the IEEE's International Solid State Circuits Conference in February. The Tukwila chip has over 2 billion transistors, three times the transistors of the current "Montvale" dual-core Itanium 9000s, and is 700 square millimeters in size. Each of the four cores on the Tukwila chip has HyperThreading, giving each chip eight instruction threads, and the chip carries 30 MB of L3 cache memory, up from 24 MB on the Montvale and predecessor "Montecito" dual-core Itanium 9000 chips. Back in February, Intel said that the QuickPath Interconnect would enable processor-to-processor bandwidth of 96 GB/sec and peak memory bandwidth of 34 GB/sec. The word on the street is that the Tukwila chips are expected to range from 1.2 GHz to 2 GHz in speed and offer about twice the performance of the current 1.66 GHz Montvales--presumably that is the relative performance for the top-end parts. That large cache is one of the culprits behind the expected 170 watt thermal design point for the Tukwila.
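For readers who like to check the arithmetic, here is a minimal sketch that derives a few numbers from the Tukwila figures quoted above. It assumes two threads per core, which is what four cores and eight threads imply, and it treats "over 2 billion" transistors as a flat 2 billion, so the density figure is a floor rather than a spec.

```python
# Tukwila figures as quoted in the article; variable names are illustrative.
CORES = 4
THREADS_PER_CORE = 2          # assumption: implied by four cores, eight threads
TUKWILA_CACHE_MB = 30
MONTVALE_CACHE_MB = 24
TRANSISTORS_MILLIONS = 2_000  # "over 2 billion", so this is a lower bound
DIE_AREA_MM2 = 700

threads_per_chip = CORES * THREADS_PER_CORE
cache_growth_pct = (TUKWILA_CACHE_MB / MONTVALE_CACHE_MB - 1) * 100
density_millions_per_mm2 = TRANSISTORS_MILLIONS / DIE_AREA_MM2

print(f"Threads per chip: {threads_per_chip}")
print(f"Cache growth over Montvale: {cache_growth_pct:.0f} percent")
print(f"At least {density_millions_per_mm2:.2f} million transistors per square mm")
```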
Looking ahead, Gelsinger said that the Itanium family would see "Poulson" chips, based on a new microarchitecture and using a 32 nanometer process, followed by "Kittson," a chip Intel has said almost nothing about and continues to say nothing about. There is some chatter out there about Poulson coming to market in late 2009 and having more than four cores (possibly six but probably eight), but considering that Tukwila is only getting into machines in early 2009 and the 32 nanometer chip making processes probably won't be fully ramped until 2010, I think you can expect Poulson around then. And maybe Kittson, whatever that is, in 2012.
Before getting into the Nehalem chips, Gelsinger took a brief pitstop in the present and talked a little bit about the six-core "Dunnington" variant of the current "Penryn" Xeons, which started shipping last fall and which are implemented using 45 nanometer processes. The main job of the Dunnington designs is to blunt the attack by AMD in the high-end of the X64 server space with its current "Barcelona" and future "Shanghai" Opteron 8000 series chips. These Dunnington chips will be sold as the Xeon 7400 series and will only be available in four-socket servers--no single- or two-socket machines, no desktops; they can plug into the same "Caneland" server platforms that the quad-core "Tigerton" Xeon 7300s use. The Dunnington chip has 16 MB of L3 cache on the chip.
Gelsinger showed off a TPC-C online transaction processing performance benchmark for an eight-socket IBM System x3950 M2 server using Dunnington chips that broke 1.2 million transactions per minute in performance. That machine was configured with Dunnington chips running at 2.66 GHz with 3 MB of L2 cache per core and the shared 16 MB of L3 cache. By way of comparison, the same x3950 server using the quad-core Xeon X7350 processors running at 2.93 GHz and having 4 MB of L2 cache per core was able to handle 841,809 TPM. So the move to Dunnington, which added 50 percent more cores running slightly slower but also added L3 cache, yielded 42.6 percent more OLTP performance. Neither the Tigerton nor the Dunnington chips have HyperThreading, which could have boosted performance a bit more for both machines.
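The 42.6 percent figure is easy to reproduce, and the same numbers give a rough per-core comparison. The sketch below assumes the Tigerton run used the same eight sockets as the Dunnington run, and it takes the Dunnington result as exactly 1.2 million TPM even though the machine only "broke" that mark, so the per-core numbers are approximations and not audited TPC-C results.

```python
# Benchmark figures as quoted in the article; helper names are illustrative.
DUNNINGTON_TPM = 1_200_000  # six cores per socket at 2.66 GHz (rounded floor)
TIGERTON_TPM = 841_809      # four cores per socket at 2.93 GHz
SOCKETS = 8                 # assumption: both runs used eight sockets


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new / old - 1.0) * 100.0


def tpm_per_core(tpm: float, sockets: int, cores_per_socket: int) -> float:
    """Throughput spread evenly across every core in the box."""
    return tpm / (sockets * cores_per_socket)


gain = percent_gain(DUNNINGTON_TPM, TIGERTON_TPM)
dunnington_core = tpm_per_core(DUNNINGTON_TPM, SOCKETS, 6)
tigerton_core = tpm_per_core(TIGERTON_TPM, SOCKETS, 4)
per_core_change = percent_gain(dunnington_core, tigerton_core)

print(f"System gain: {gain:.1f} percent")
print(f"TPM per core: {dunnington_core:,.0f} vs {tigerton_core:,.0f}")
print(f"Per-core change: {per_core_change:.1f} percent")
```

Under those assumptions the system-level gain comes out ahead while throughput per core slips a little, which is what you would expect from more cores running at a lower clock.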
The Dunnington chips are expected to start shipping in machines in September, and Intel has been providing parts to customers since July.
Now, on to Nehalem and some new nomenclature that Intel is going to use to get away from the old 2P and 4P socket count designations it has used for many years. So, we are going to have the Nehalem EP, short for "efficient performance," in two-socket servers and the Nehalem EX, short for "expandable," in four-socket and larger servers. (The 2P variant of Nehalem has the code-name "Gainestown," while the 4P variant is called "Beckton" internally.) People are saying that the 2P boxes will only support four cores, while the 4P boxes will get eight cores, but Intel did not confirm this publicly at IDF. On the desktop and in the laptop, Intel has already said that it will call the Nehalem chips by the Core i7 brand, and whatever "i7" means, Intel is not saying. (Probably short for seventh iteration is my guess.) The Core i7 and Nehalem EP chips are coming to market first, with the Core i7 chip expected in the fourth quarter of 2008. (Gelsinger did not say when the Nehalem EP server chips would be ready, so it looks like early 2009.) These will be followed by the Nehalem EX for big servers and additional desktop and laptop Nehalem processors (code-named Havendale and Lynnfield on desktops and Auburndale and Clarksfield on laptops) in the second half of 2009.
Anyway, at IDF last week, Gelsinger showed off the first silicon of the Nehalem EX chip, which is an eight-core processor with all the cores on a single die. (This ain't no quasi-octo-core chip.) And the first feature Gelsinger talked about was a power gating technology that will allow the chip to turn off unused cores in the processor complex to save power and to boost the clock speed on the cores where work is being done to get that work done more quickly. This is called Turbo Mode.
Gelsinger said that the Nehalem design would come with 2, 4, or 8 cores and would have an integrated three-channel DDR3 main memory controller on the chip. The cores on the chip sport a new generation of simultaneous multithreading (Intel is not calling it HyperThreading-2), SSE4.2 instructions, and the QuickPath Interconnect. The Nehalem chip will allow up to 288 GB of main memory in a two-socket box (that's 18 memory slots), according to Gelsinger, and he showed a Nehalem system offering 3.4 times the performance of a current "Harpertown" Xeon DP chip in the Penryn family. These machines were running the Stream supercomputer benchmark, which is a good test for memory bandwidth. The Harpertown system used a 3 GHz processor with a 1.6 GHz front side bus and 800 MHz DDR2 main memory, while the Nehalem box had QPI links running at 6.4 GT/sec (up to 25.6 GB/sec of aggregate bandwidth) and 1 GHz DDR3 main memory. The Core i7 desktop chip has 8 MB of shared L3 cache for the cores, but Intel has not said how much cache will be in the EP and EX server variants, which presumably will be called Xeons. Each Nehalem core is expected to have 32 KB of L1 instruction cache, 32 KB of L1 data cache, and 256 KB of L2 cache.
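A little back-of-the-envelope math puts the memory figures above in context. The sketch below works out the DIMM size implied by 288 GB across 18 slots and the theoretical peak bandwidth of the old front side bus against the new three-channel memory controller. It assumes 64-bit (8-byte) data paths and reads "1.6 GHz" and "1 GHz" as 1,600 and 1,000 million transfers per second. These are paper peaks for one bus and one socket, not Stream results.

```python
# Illustrative peak-bandwidth arithmetic; names and assumptions are ours, not Intel's.
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path on the FSB and each DDR3 channel


def gb_per_dimm(total_gb: int, slots: int) -> float:
    """Capacity each slot must hold to reach the quoted maximum."""
    return total_gb / slots


def peak_gb_per_sec(mega_transfers_per_sec: float, channels: int = 1) -> float:
    """Theoretical peak: transfers per second times bytes per transfer times channels."""
    return mega_transfers_per_sec * BYTES_PER_TRANSFER * channels / 1000.0


dimm_gb = gb_per_dimm(288, 18)                  # two-socket Nehalem box
fsb_peak = peak_gb_per_sec(1600)                # one Harpertown front side bus
ddr3_peak = peak_gb_per_sec(1000, channels=3)   # one Nehalem socket, three channels

print(f"DIMM size needed: {dimm_gb:.0f} GB per slot")
print(f"Front side bus peak: {fsb_peak:.1f} GB/sec")
print(f"Three-channel DDR3 peak: {ddr3_peak:.1f} GB/sec per socket")
```

The point of the comparison is that with the memory controller on the chip, each socket gets its own path to memory and no longer funnels memory traffic through a front side bus, which is the kind of change a bandwidth test like Stream rewards.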
The kicker to Nehalem is "Westmere," which will be implemented in a 32 nanometer process and have some minor architecture tweaks and possibly 4 core and 6 core variants. After that comes "Sandy Bridge," which sports yet another new microarchitecture and which will be implemented in 32 nanometer processes. A few Westmere chips are expected to come to market in 2009, followed by a complete ramp in 2010. It seems likely that Sandy Bridge will debut in late 2010 and start ramping through 2011.