The Performance Impact Of Spectre And Meltdown
March 12, 2018 Timothy Prickett Morgan
We have been waiting to see what impact the Spectre and Meltdown speculative execution patches – which plug security holes that search engine giant Google discovered last summer and made public in early January – would have on the performance of Power Systems iron running the IBM i operating system.
Now that Big Blue has published the first edition of the Power Systems Performance Report that includes the new “ZZ” Power9-based systems, we not only get a sense of the relative performance of the “Nimbus” Power9 chip for entry servers, but we can also figure out the performance impact of the Spectre and Meltdown patches as companies apply them. With some caveats.
Let’s start with the good news. Based on some very early renditions of the Spectre and Meltdown patches that were applied to Red Hat Enterprise Linux 7 during the first week that the vulnerabilities were exposed to the general public, we had been expecting that systems running online transaction processing (OLTP) workloads and heavily virtualized environments – like those commonly run on IBM i platforms – could take anywhere from an 8 percent to a 19 percent performance hit. Google and Red Hat cautioned that these results were based on so-called microbenchmarks, meaning they were tests that did not necessarily stress the system in precisely the same way as real-world applications, and that extrapolating from these tests to your own code was not precisely valid. But it was all we had at the time to make some contingency plans for dealing with the impact of the Spectre and Meltdown patches. You do what you can with what you got.
We are an empirical lot here at IT Jungle, so we took the March 2018 edition of the performance report and lined it up against the February 2017 version of the document and did some before-after analysis. IBM did not provide any documentation specifically addressing the performance impacts of the Spectre and Meltdown patches, so we had to do the before-after math on every Power8 system in the lineup to see what the impact was, based on IBM’s own Commercial Workload Performance (CPW) test. It didn’t take long for us to see a pattern. Here is what it looks like for entry machines:
We worked backward and estimated what the performance of the Power9 ZZ machines would be if they did not have the Spectre and Meltdown patches applied. And here is what it looks like for bigger NUMA machines based on Power8 processors:
As you can see, the performance hit from the Spectre and Meltdown patches is 5.2 percent of peak raw CPW capacity, except for the Power S812 with a single Power8 core, which is 5.3 percent. This is a hell of a lot better than taking an 8 percent to 19 percent performance hit.
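To make the before-and-after arithmetic concrete, here is a minimal sketch of the percentage-delta calculation we ran against the two editions of the performance report. The CPW figures below are hypothetical placeholders to show the math, not numbers taken from IBM's documents.

```python
# Sketch of the before/after CPW math. The CPW values are hypothetical
# placeholders, not figures from IBM's Performance Report.

def spectre_meltdown_hit(cpw_before: float, cpw_after: float) -> float:
    """Return the performance hit as a percentage of pre-patch CPW."""
    return (cpw_before - cpw_after) / cpw_before * 100.0

# Hypothetical entry machine: 115,000 CPW before patches, 109,020 after.
hit = spectre_meltdown_hit(115_000, 109_020)
print(f"{hit:.1f} percent")  # prints "5.2 percent"
```

Running the same subtraction-and-divide across every Power8 machine in the report is what surfaced the suspiciously uniform 5.2 percent pattern.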
A few thoughts. First of all, I am immediately suspicious of anything that is this mathematically clean, and although I do not know it, I suspect that IBM may have tested a few machines, got consistent results on them, and then applied an algorithm to the old CPW data to come up with the new post-Spectre/Meltdown CPW data. I don’t have a problem with that so long as it reflects reality, and the whole point of the CPW tests is that IBM is standing behind the relative performance of different processors in the Power Systems line by the very act of publishing these results. IBM never guarantees the absolute performance of any system, although for machines running AIX and Linux Big Blue does have a guarantee that they can be run at a higher sustained utilization than an X86 server of roughly the same raw performance.
Here is another thing to consider. In the Red Hat tests, the Spectre/Meltdown patches had a more moderate impact on Java virtual machines and on database analytics and decision support systems – on the order of 3 percent to 7 percent – and that is because this software often aggregates requests between the kernel and user spaces instead of streaming them out individually. (Every crossing of the kernel-user space boundary is where the patches slow things down.) There was an even smaller impact, on the order of 2 percent to 5 percent in the Red Hat tests, for more generic kinds of raw calculations, and any function that bypasses the kernel entirely is going to be unaffected by the patches. It is hard to say if the impact on these workloads will scale back like it did for the database and virtualization stack, but it could. And then again, maybe it could not.
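To illustrate why aggregation softens the blow, here is a toy cost model – our own illustration, not Red Hat's or IBM's methodology, with made-up overhead numbers – in which every kernel crossing pays a fixed patch penalty, so batching many requests into one crossing amortizes that penalty.

```python
# Toy model (hypothetical costs): each kernel-user boundary crossing pays
# a fixed overhead under the Spectre/Meltdown patches, so software that
# batches requests per crossing pays that overhead far less often.

PATCH_OVERHEAD_US = 1.0   # hypothetical added cost per kernel crossing
WORK_US = 0.2             # hypothetical cost of the work itself per request

def total_cost(requests: int, batch_size: int) -> float:
    """Total microseconds for the given requests at a given batch size."""
    crossings = -(-requests // batch_size)  # ceiling division
    return crossings * PATCH_OVERHEAD_US + requests * WORK_US

unbatched = total_cost(10_000, 1)    # one crossing per request
batched = total_cost(10_000, 64)     # JVM/database-style aggregation
```

In this toy model the unbatched run pays the crossing overhead 10,000 times while the batched run pays it only 157 times, which is the same reason the JVM and database stacks fared better in the Red Hat microbenchmarks.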
The point is, just as we said before, you need to benchmark your systems before you apply the Spectre and Meltdown patches, on a variety of workloads that commonly run on your production machines, and then retest them after the patches are applied. You may get results that are very different from the ones suggested by the old and new CPW tests. The goal is to know for sure, so that if your results are not consistent with IBM's CPW tests, you know you have a problem and can solicit help from the experts at your business partner and IBM. Do this work at the front end, rather than wishing that you had after something goes more wrong than expected. Because, as you know, sometimes things go wronger than planned.
Next week we will get into the performance of the Power9 ZZ systems and start figuring out how they stack up to Power7, Power7+, and Power8 iron in the same entry class.