• The Four Hundred
  • The NUMA NUMA [Song] Tax

    October 1, 2018 Timothy Prickett Morgan

    When you have a wafer of chips, at least in theory all of the transistors cost the same on the wafer. But sometimes, when transistors perform certain functions, they are worth more. And in some cases, such as the electronics that couple multiple cores on a die across a shared L3 cache, or that gang up processors across multiple sockets to give applications a larger, shared main memory to run in, those circuits are worth a lot more.

    This lashing together of compute components across shared memories – Non-Uniform Cache Access, or NUCA, for the on-chip coupling and Non-Uniform Memory Access, or NUMA, for the multi-socket coupling – makes up a significant portion of the die area and transistor budget, and therefore should represent a significant portion of the overall cost, and then the price, of a given processor. NUMA predates NUCA, and in a sense, NUCA is just an on-chip version of NUMA. A modern processor is basically a NUMA server implemented on a single socket, like the 12-socket RS/6000 S70, AS/400 740-2070, and AS/400 S40-2208 systems from the late 1990s based on the “Northstar” PowerPC chip – remember those? The big difference is that those machines had 100X less memory and 15X lower clock speeds.

    We get a lot more for our Power Systems money these days, to be sure. But relative to the market at large, Power Systems running IBM i and AIX are still paying a premium compared to Windows Server and Linux systems with equivalent performance and functionality.

    Back then, because IBM published list prices, as had been its habit – compelled by law after settlements of several antitrust lawsuits decades earlier – we could actually calculate the incremental cost of the NUMA expansion in the machines. Then as now, customers did not just pay extra for the NUMA chipsets that glued multiple processor sockets into a single system image; they also paid in ever-higher system software license costs. Moreover, performance did not scale perfectly as NUMA levels increased from 2 to 4 to 8 to 12 sockets; the incremental performance improvement got smaller as the NUMA cluster got larger, and this is something that affected, and continues to affect, all NUMA machines, not just IBM Power Systems. I will say that the NUMA interconnects have gotten better, faster, and more efficient over time, and more of the aggregate CPU clocks can actually get work done if the compute complexes are not memory or I/O constrained.

    I will give you a for instance from back then to prove a point, and to show that IBM used to be generous. Back in the first quarter of 1999, a single socket Northstar system, the AS/400 730-2065, had a single 262 MHz – yes that is megahertz – processor and it cost $390,000; it was in a P30 software tier and rated at 560 CPWs of performance. That worked out to $697 per CPW. The AS/400 730-2066 had two of the same processors and was rated at 1,050 CPWs but cost $650,000, or $619 per CPW. Moving to four of the Northstar processors in the AS/400 730-2067 meant having an aggregate of 2,000 CPWs, and the cost dropped to $515 per CPW with a price tag of $1.03 million; the software group jumped to P40. These were for full-on machines, with both 5250 and client/server workloads able to run full out. (You could get machines with varying levels of crippling on either style of workload, and that changed the economics a lot.) The AS/400 740-2069 put eight of these processors in a single system image, which had 3,660 of aggregate CPWs at a cost of $1.44 million, or $393 per CPW, and the software tier rose to P50. And finally the big bad AS/400 740-2070 had a dozen of these chips and was rated at 4,550 CPWs, and at a cost of $1.68 million, that worked out to $369 per CPW for the P50 tier system.
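
    The cost per CPW figures above are simple divisions, and a minimal Python sketch makes them easy to recheck. The prices and ratings are the ones quoted in the paragraph; the socket count of eight for the 740-2069 is an assumption, taken from the 2, 4, 8, 12 progression mentioned earlier, and the names `systems` and `cost_per_cpw` are just illustrative.

```python
# List prices (USD) and CPW ratings as quoted above for Q1 1999.
# Assumption: the 740-2069 is the eight-socket step in the 2/4/8/12 progression.
systems = [
    # (model, sockets, list_price_usd, cpw)
    ("AS/400 730-2065", 1, 390_000, 560),
    ("AS/400 730-2066", 2, 650_000, 1_050),
    ("AS/400 730-2067", 4, 1_030_000, 2_000),
    ("AS/400 740-2069", 8, 1_440_000, 3_660),
    ("AS/400 740-2070", 12, 1_680_000, 4_550),
]


def cost_per_cpw(price_usd, cpw):
    """Dollars of list price per unit of CPW performance."""
    return price_usd / cpw


for model, sockets, price, cpw in systems:
    print(f"{model}  {sockets:>2} sockets  ${cost_per_cpw(price, cpw):,.0f} per CPW")
```

    With these rounded prices, the single-socket machine computes to a little over $696 per CPW, within a dollar of the quoted $697, presumably because the list prices shown here are themselves rounded. The other four configurations match the quoted figures.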

    Obviously, compute was much more constrained, from the point of view of chip design, and IBM wanted to encourage companies to invest in ever larger systems, and so it made the CPWs cheaper as companies wanted more of them. The NUMA tax, therefore, was all in the software tier elevation costs and in the gradually decreasing incremental performance as the NUMA cluster grew.
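
    The "gradually decreasing incremental performance" can be put in numbers from the same CPW ratings. The sketch below is one way to do it, not a standard IBM metric: it compares each machine against perfect linear scaling from the single-socket system, and computes the CPWs gained per socket added at each step. The function names are illustrative, and the eight-socket count for the 3,660 CPW machine is an assumption.

```python
# (sockets, CPW) pairs quoted in the article for the Northstar AS/400 line.
# Assumption: the 3,660 CPW machine (740-2069) is the eight-socket configuration.
ratings = [(1, 560), (2, 1_050), (4, 2_000), (8, 3_660), (12, 4_550)]


def scaling_efficiency(ratings):
    """CPW delivered as a fraction of perfect linear scaling from the first entry."""
    base_sockets, base_cpw = ratings[0]
    per_socket = base_cpw / base_sockets
    return [(sockets, cpw / (sockets * per_socket)) for sockets, cpw in ratings]


def incremental_cpw_per_socket(ratings):
    """Extra CPW gained per socket added at each step up the line."""
    steps = []
    for (s0, c0), (s1, c1) in zip(ratings, ratings[1:]):
        steps.append((s0, s1, (c1 - c0) / (s1 - s0)))
    return steps


for sockets, eff in scaling_efficiency(ratings):
    print(f"{sockets:>2} sockets: {eff:.1%} of linear scaling")

for s0, s1, gain in incremental_cpw_per_socket(ratings):
    print(f"{s0:>2} -> {s1:>2} sockets: {gain:.1f} CPW per added socket")
```

    On these figures, the gain per added socket falls from 490 CPWs at the first step to 222.5 CPWs at the last, which is the performance side of the NUMA tax described above.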

    Fast forward to today. A Power S914 with a single Power9 chip with four cores – a mere four cores – can crank through 52,500 CPWs, or about 13,125 per core. A single Power9 core has nearly three times the performance of a 12-way NUMA AS/400 740 series server. That is just crazy, but then again, it isn’t. It is just Moore’s Law, going wide on the cores and threads after it is no longer possible to go deep on the clock speeds. (The base clock speed on that Power9 core, at 2.3 GHz, is about 8.8X higher than that of the Northstar from days gone by. But it is not 10 GHz or 20 GHz as we thought was possible many years ago.) A single socket “Cumulus” Power9 machine with all 12 cores activated and running at 2.9 GHz would deliver about 70,000 CPWs, we estimate, and 12 of these sockets running at 3.55 GHz would deliver 2.05 million CPWs. We estimate that this machine, configured up, would cost a few million bucks – IBM has not provided pricing of any sort for this machine, but clearly we are down in the range of a buck or two per CPW. Call it a factor of 200X improvement (meaning lowering) in the cost of raw compute.
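
    The ratios in this comparison can be rechecked with a few lines of Python. The inputs are all taken from the article; the "buck or two per CPW" is the article's own rough estimate, not IBM pricing, so the improvement factor is shown as a range. The constant and function names are illustrative.

```python
# Figures quoted in the article.
S914_CPW = 52_500                    # Power S914, single Power9 chip
S914_CORES = 4
AS400_740_2070_CPW = 4_550           # 12-socket Northstar system
NORTHSTAR_MHZ = 262
POWER9_BASE_MHZ = 2_300
NORTHSTAR_12WAY_COST_PER_CPW = 369   # dollars, Q1 1999

cpw_per_core = S914_CPW / S914_CORES
core_vs_12way = cpw_per_core / AS400_740_2070_CPW
clock_ratio = POWER9_BASE_MHZ / NORTHSTAR_MHZ


def improvement(old_cost_per_cpw, new_cost_per_cpw):
    """How many times cheaper a CPW is now than it was then."""
    return old_cost_per_cpw / new_cost_per_cpw


print(f"CPW per Power9 core: {cpw_per_core:,.0f}")
print(f"One Power9 core vs. 12-way 740-2070: {core_vs_12way:.2f}X")
print(f"Base clock ratio: {clock_ratio:.1f}X")

# The article's estimate of "a buck or two" per CPW, treated as a range.
for new_cost in (2, 1):
    factor = improvement(NORTHSTAR_12WAY_COST_PER_CPW, new_cost)
    print(f"At ${new_cost} per CPW: {factor:.1f}X cheaper")
```

    The estimated range works out to roughly 185X to 369X, so the 200X in the text sits at the conservative end.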

    But that is not the question. What I want to know, and what I cannot yet figure out, is if the incremental cost of scaling machines across sockets is now higher as the machines scale all the way from the Power S914 through to the Power E980.

    Back in April, I did a price/performance analysis of the “ZZ” Power9 entry systems that can run IBM i, and it showed that depending on the configuration the cost per CPW for the hardware of a base system and its base IBM i license with a reasonable number of users rose a little as the system got more powerful thanks to core scaling within a socket, but got a lot more expensive sometimes as the number of sockets grew. In general, it was the software cost that made this true, and that is because software costs scale up with GDP and inflation (software being created and supported by people) while processor, memory, and storage costs scale with Moore’s Law (getting less expensive over time, cut in half every 36 months in the Power Systems line). Here is a reminder of what data we do have:

    The only thing we have to compare is the Power S914, the Power S922, and the Power S924, all three of which can run IBM i on all of their cores. These were prices with all cores activated to run IBM i and for base hardware with a reasonable amount of memory and disk. As you can see, the cost per CPW for the hardware rises a bit from the Power S914 to the Power S924, and drops with the Power S922, which is a denser machine with less expandability and therefore has a bit of a price cut on hardware. But the IBM i software costs triple or quadruple, depending on the points you compare, between the one-socket Power S914 and the two-socket Power S924, and this is a kind of NUMA tax. There is no IBM i allowed on the four-socket Power E950, but we suspect that the hardware costs rise here, and so would the software tier costs if IBM i were allowed. We have no pricing on the E980, but we strongly suspect that IBM charges more per unit of capacity for these machines, which scale up to 16 sockets and 64 TB of main memory, and not just for the systems software but also for the hardware. We can’t prove it, but it is a hunch. We will try to see if this is the case, because it would represent an important reversal in IBM’s pricing strategy.

    Stay tuned, and if you know something, say something.

    [Editor’s note: A heartfelt shout out to Gary Brolsma, who lip-synched the Numa Numa Song with such joy and enthusiasm, and whose video was the first thing I ever saw on YouTube. You are an inspiration, even after all of these years.]

    RELATED STORIES

    Counting The Cost Of IBM i On Power9 Entry Systems

    Bang For The Buck On Power9 Entry Hardware

    The Performance Impact Of Spectre And Meltdown

    The Deal On Power9 Memory For Entry Servers

    Inside IBM’s Power S924 Power9 Entry System

    Drilling Down Into The New Power9 Entry Servers

    At Long Last, IBM i Finally Gets Power9

    IBM Preps Power9 “ZZ” Systems For Imminent Launch

    IBM Readies Mainstream Power9 Iron For Launch

    The AS/400 Lessons Come Back Around With Power9 Systems

    IBM Deal Prices Current Power8 Compute Like Future Power9

    Advice For The Power Systems Shop That Has To Buy Now

    Power9 Big Iron “Fleetwood/Mack” Rumors

    Talking Power9 With IBM Fellow Brad McCredie

    The Power Neine Conundrum

    IBM i And AIX Won’t Get Power9 Until 2018


    Tags: CPW, Cumulus, E980, IBM i, Linux, Non-Uniform Cache Access, Non-Uniform Memory Access, Northstar, NUCA, NUMA, Power S914, Power S922, Power S924, Power Systems, Power9, Windows


TFH Volume: 28 Issue: 65


Copyright © 2025 IT Jungle