• The Four Hundred
  • Power8 Packs More Punch Than Expected

    August 18, 2014 Timothy Prickett Morgan

    Here is something you don’t see every day in the systems business. IBM is getting even better performance out of the new Power8 processors that were launched back in April than it anticipated. Systems performance engineer Alex Mericas, who works in IBM’s Systems and Technology Group, gave a presentation at the Hot Chips 26 conference in Silicon Valley last week, revealing that the Power8 was delivering a little more oomph than expected.

    As The Four Hundred previously reported, IBM revealed a lot of the details around the Power8 chip about a year ago at the Hot Chips 25 conference, showing off the core count and memory and I/O bandwidth of its 12-core Power engine.

    We didn’t know this at the time, but what IBM was talking about then was a monolithic 12-core Power8 chip that it now refers to as the Power8 Scale-Up version. This one, Mericas explained, has all of the cores on the same die, plus a large memory capacity of up to 1 TB per socket and very high memory bandwidth to go with it. The Power8 Scale-Up also sports 32 PCI-Express 3.0 links, 16 of which can be configured as Coherent Accelerator Processor Interface (CAPI) ports to hook into accelerator co-processors based on GPUs, DSPs, FPGAs, and anything else you might want to put on a PCI card and hook into the virtual memory of the Power8 complex.

    The chips that actually shipped in the initial Power8-based machines are what IBM calls the Power8 Scale-Out processors. In this case, IBM actually chops a Power8 chip in two (presumably because half of it has something messed up on the die), with six cores on each half. Each half gets the full complex of local SMP links and accelerators, instead of sharing one complex across all twelve cores as on the monolithic die. The Scale-Out chip is a little wider than the Scale-Up version because of this. It also has 48 PCI-Express 3.0 links, with up to 32 of these being configurable as CAPI ports.

    Mericas took performance data from last year’s Hot Chips, which showed the expected performance of the Power8 Scale-Up chip that was in development at the time, and added to it the actual measured performance of the Power8 Scale-Out on various workloads. The estimates were made comparing a Power7+ and a Power8 at the same 4 GHz baseline clock speed. For the real-world tests, IBM pitted a two-socket Power 740+ server with two eight-core Power7+ processors against a two-socket Power S824 machine based on two six-core Scale-Out multichip modules. In the real world, that Power7+ chip runs at 4.2 GHz and the Power8 chip runs at 3.52 GHz, so the comparison only makes sense if the chips are normalized to a common clock speed. There is a 3.6 GHz version of the Power 740+, and it seems likely that IBM either clocked them both at 3.6 GHz, or overclocked the Power8 up to 4 GHz and slowed down the Power7+ to 4 GHz, to make the comparisons fair. Here is the result:

    As you can see, the memory bandwidth as delivered by the initial Power8 systems is a bit higher than expected (about 12 percent in my estimate of the chart above), and so is the Java performance (about 15 percent). Floating point math operations are about what Big Blue expected they would be, but integer performance is a little bit lower (it looks to be around 10 percent lower) compared to expectations. The commercial performance–which is very likely based on the TPC-C or SAP SD transaction processing tests, but IBM does not say–also came in a bit shy of expectations, about 8 percent by my eye.
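
    To make the clock normalization behind that comparison concrete, here is a minimal sketch of what scaling both chips to the 4 GHz baseline would look like. It assumes performance scales linearly with clock frequency, which is an idealization and not something IBM has confirmed it did; the function name `normalize_to_clock` is hypothetical.

```python
def normalize_to_clock(score: float, actual_ghz: float, target_ghz: float) -> float:
    """Rescale a benchmark score to a target clock speed.

    Assumes performance is proportional to clock frequency, which is an
    idealization; memory-bound workloads in particular scale less than this.
    """
    if actual_ghz <= 0 or target_ghz <= 0:
        raise ValueError("clock speeds must be positive")
    return score * target_ghz / actual_ghz


# Clock speeds quoted in the article.
POWER7_PLUS_GHZ = 4.2   # Power 740+ as shipped
POWER8_GHZ = 3.52       # Power S824 as shipped
BASELINE_GHZ = 4.0      # baseline used for IBM's estimates

# Multipliers that would bring a raw score of 1.0 to the 4 GHz baseline.
p7_factor = normalize_to_clock(1.0, POWER7_PLUS_GHZ, BASELINE_GHZ)
p8_factor = normalize_to_clock(1.0, POWER8_GHZ, BASELINE_GHZ)

print(f"Power7+ scale factor: {p7_factor:.3f}")
print(f"Power8 scale factor:  {p8_factor:.3f}")
```

    Under that linear assumption, a raw Power7+ score would be trimmed by about 5 percent and a raw Power8 score raised by about 14 percent before the two are compared.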

    Such wiggling between theoretical and modeled performance and actual performance can be attributed to a few factors, and one of them might have to do with the difference between a monolithic chip and two half chips sharing the same package. Perhaps more significantly, the fact that IBM can reckon the performance of such workloads in its models and come even close to the numbers, with a variance of 8 to 15 percent up or down, is pretty remarkable.
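
    As a rough way to see how tight the model was, the eyeballed deviations quoted above can be tabulated and summarized. This is only a sketch built on my chart readings, not on IBM data, and it enters floating point as zero because it landed about where IBM expected.

```python
# Eyeballed deviations of measured from projected performance, in percent,
# as quoted in the article. Floating point is entered as 0 because the
# article only says it came in about where expected (an assumption).
deviation_pct = {
    "memory bandwidth": 12,
    "Java": 15,
    "floating point": 0,
    "integer": -10,
    "commercial": -8,
}

# Workloads that most exceeded and most undershot the model.
best = max(deviation_pct, key=deviation_pct.get)
worst = min(deviation_pct, key=deviation_pct.get)

# Average size of the miss, ignoring direction.
mean_abs = sum(abs(v) for v in deviation_pct.values()) / len(deviation_pct)

print(f"Largest upside:   {best} ({deviation_pct[best]:+d}%)")
print(f"Largest downside: {worst} ({deviation_pct[worst]:+d}%)")
print(f"Mean absolute deviation: {mean_abs:.1f}%")
```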

    IBM did not talk about how the performance of these various workloads would vary based on the operating system, and presumably these are all performance metrics based on IBM’s AIX Unix variant running atop the processor. Generally speaking, for a lot of workloads, there is not a significant performance difference between AIX, Linux, and IBM i on Power iron up to a certain level of scalability, but across multiple processors, AIX and Linux scale better than does IBM i.

    In addition to talking about actual versus expected performance on Power8 chips relative to a Power7+ baseline, Mericas also talked a bit about the batch performance of the Power8 processor. And yes, in case you missed it, batch is back in vogue. Well, to be fair, it never really left, as IBM i and System z mainframe shops know full well. But what a lot of people don’t realize is that Hadoop and its MapReduce protocol, so commonly used for big data analytics these days, is a batch processing system. And as IBM tries to position Power8 clusters as a perfect place to run modern workloads like Hadoop, batch performance therefore matters a great deal.

    To stress test the Power8 under batch processing conditions, IBM grabbed an internal benchmark that emulates batch tasks performing compression under conditions where the response time for the transactions is important. IBM compared a Power7+ core to a Power8 core with one thread dedicated to the task and no other threads on a core doing any other work. In another set of runs, IBM turned on the maximum number of threads for each core (four for the Power7+ and eight for the Power8) and ran the batch job again.

    For single-threaded batch work on this unnamed workload, the Power8 core delivered 2.3X the throughput on the batch work and a 56 percent lower response time compared to the Power7+ core. (These tests were done on two sixteen-core systems, presumably at the same clock speeds or with the data normalized for clock speeds.) If you turn on simultaneous multithreading (SMT) on the Power7+ core, the Power8 with no threading has an 82 percent lower response time on the batch work and delivers 1.4X more throughput. And if you compare the Power7+ with SMT4 (which means four threads of SMT) to Power8 with SMT8 (eight threads per core), then the Power8 core has a 31 percent lower response time and delivers 2.9X the throughput of the Power7+ core.
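
    Percent-lower response times and throughput multiples are easy to misread side by side, so here is a small sketch that puts the three comparisons above on the same footing. It uses only the figures quoted in this story, and it reads "1.4X more throughput" as a 1.4X ratio, which is an assumption. The `BatchComparison` class is a hypothetical helper for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchComparison:
    """Power8 core relative to a Power7+ core on IBM's batch benchmark."""
    label: str
    throughput_ratio: float         # Power8 throughput / Power7+ throughput
    response_time_reduction: float  # fraction by which response time drops

    @property
    def response_time_ratio(self) -> float:
        # Power8 response time as a fraction of the Power7+ response time.
        return 1.0 - self.response_time_reduction

    @property
    def response_time_speedup(self) -> float:
        # How many times faster each transaction completes.
        return 1.0 / self.response_time_ratio


# Figures as quoted in the article; "1.4X more" is treated as a 1.4X ratio.
comparisons = [
    BatchComparison("Power8 ST vs Power7+ ST", 2.3, 0.56),
    BatchComparison("Power8 ST vs Power7+ SMT4", 1.4, 0.82),
    BatchComparison("Power8 SMT8 vs Power7+ SMT4", 2.9, 0.31),
]

for c in comparisons:
    print(
        f"{c.label}: {c.throughput_ratio:.1f}X throughput, "
        f"response time at {c.response_time_ratio:.0%} of Power7+ "
        f"({c.response_time_speedup:.2f}X faster per transaction)"
    )
```

    Seen this way, the SMT8 configuration gives up some per-transaction speed in exchange for the highest throughput of the three, which is the tradeoff the threading knobs expose.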

    The lesson here is that performance, for either online or batch work, depends on how threaded your software is, and you can tune the system to get a desired mix of throughput and response time based on how many threads per core you activate. You are in control of those knobs, and SMT can be dynamically allocated as conditions change. This is a far cry from the old days when you added a few disk arms or a few sticks of memory to try to rejigger throughput and response time, or worse yet, did a processor upgrade based on some generic relative performance metric like RAMP-C or CPW and then did not see the expected performance gains after spending the money. IBM i shops have to know how their apps are threaded and set goals, and then see if they can tune the hardware threading to get the results they need for both batch and online work.

    RELATED STORIES

    IBM Readies More Power8 Iron For Launch

    Counting The Cost Of Power8 Systems

    Four-Core Power8 Box For Entry IBM i Shops Ships Early

    Thanks For The Cheaper, Faster Memories

    Threading The Needle Of Power8 Performance

    Lining Up Power7+ Versus Power8 Machines With IBM i

    IBM i Shops Pay The Power8 Hardware Premium

    As The World Turns: Investments In IBM i

    Doing The Two-Step To Get To Power8

    What’s New in IBM i 7.2–At a Glance

    IBM i 7.2 Available May 2

    IBM i Runs On Two Of Five New Power8 Machines

    IBM i TR8, Database Driven

    Big Blue Launches IBM i 7.1 TR8 As 7.2 Looms

    Big Blue Talks About IBM i And PureSystems

    Power8 Launch Rumored To Start At The Low End

    Rumors Say Power8 Systems Debut Sooner Rather Than Later

    Power Systems Coming To The SoftLayer Cloud

    Intel’s Xeon E7 Brings The Fight To IBM’s Power8

    IBM Pushes Performance Up, Energy Down With Power8

    IBM Licenses Power8 Chips To Chinese Startup

    What The System x Selloff Means To IBM i Shops

    Power Systems Sales Power Down In The Fourth Quarter

    New Year’s High Def, Most Def

    All Your IBM i Base Are Belong To Us

    IBM i Installed Base Dominated By Vintage Iron

    Big Blue Gives A Solid Installed Base Number For IBM i

    Reader Feedback On Big Blue Gives A Solid Installed Base Number

    Power8 Offers Big Blue And IBM i A Clean Slate

    Power8 And The Potential Oomph In Midrange And Big Boxes

    IBM Aims NextScale Hyperscale Boxes At Clouds–And Possibly Power8

    Power8 Processor Packs A Twelve-Core Punch–And Then Some

    IBM To Divulge Power8 Processor Secrets At Hot Chips

    IBM Forms OpenPower Consortium, Breathes New Life Into Power




Volume 24, Number 27 -- August 18, 2014

Copyright © 2025 IT Jungle