IBM Makes the Case for Power Systems SSDs
June 1, 2009 Timothy Prickett Morgan
Thank the explosion in the use of digital cameras, which drove flash memory prices through the floor, and send a condolence card to Polaroid, the maker of instant cameras. One of the most useful technologies to come along in years for servers is flash-based solid state disk drives, or SSDs. While SRAM and DRAM versions of SSDs have been around for ages, they were so awfully expensive that only the most serious workloads could ever justify them, and they never even came close to going mainstream. But now SSDs are going mainstream, and they are going to change the way servers are configured.
As I detailed in the wake of the April 28 Power Systems announcements, the first SSDs from IBM for the Power Systems servers are definitely going to find some use at i, AIX, and Linux shops, but the units are pricey, and until prices come down, customers are going to use them pretty sparingly. This is no doubt exactly what IBM wants customers to do, because given the high rates of I/O operations per second (IOPS) that SSDs can deliver, companies going overboard with SSDs will saturate their RAID disk controllers, succeeding only in moving a bottleneck from disk drives to controllers.
If you missed the feeds and speeds of the 2.5-inch SSD IBM is selling for Power Systems, let me refresh your memory before we get into some use cases and their economics. IBM is getting a 128 GB SSD and formatting it down to 69.7 GB; the unused capacity in the flash cells is reserved so data can be moved around to keep the flash from wearing out, a process called “wear leveling,” which extends the life of the SSD. (I didn’t know this back in May, but IBM is using the Zeus-IOPS flash drive from STEC, which sells SSDs with SAS, SCSI, Fibre Channel, and SATA interfaces.)
The SSD IBM is reselling for Power Systems uses a 3 Gb/sec SAS interface, and it is hot-pluggable into 2.5-inch SAS slots or into 3.5-inch SAS slots using a special case. This SSD has about 220 MB/sec of sustained throughput on reads and about 122 MB/sec of sustained throughput on writes, according to IBM documentation I have seen, and can handle about 28,000 IOPS of random transactional processing. The unit has an average access time for data that ranges from 20 to 120 microseconds, depending on where the data is physically located on the SSD. IBM says that the SSD consumes about one-fifth of the power of a 15K RPM disk drive of equivalent capacity.
For Power 520 and Power 550 machines, an SSD costs $10,000, while for Power 560 and Power 570 machines, it costs $13,235. (The price difference seems completely arbitrary to me.) The Power 595 does not yet have support for SSDs, but the external DS series of disk arrays from IBM that are commonly used at big Power shops have SSDs inside already. By comparison, a 139 GB (for i) or 146 GB (for AIX or Linux) SAS hard disk runs $498; the 282 GB/300 GB SAS disks run to $1,150 a pop. (This is after recent price cuts, mind you.)
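To put those list prices in perspective, here is a quick back-of-envelope dollars-per-gigabyte calculation; the prices and formatted capacities are the ones quoted above, not official IBM figures.

```python
# Dollars per gigabyte at the list prices quoted in this article.
ssd_price, ssd_gb = 10_000, 69.7        # Power 520/550 SSD, formatted capacity
disk_price, disk_gb = 498, 139          # 139 GB SAS disk (i formatting)
big_disk_price, big_disk_gb = 1_150, 282  # 282 GB SAS disk

ssd_per_gb = ssd_price / ssd_gb            # roughly $143 per GB
disk_per_gb = disk_price / disk_gb         # roughly $3.58 per GB
big_per_gb = big_disk_price / big_disk_gb  # roughly $4.08 per GB

print(round(ssd_per_gb / disk_per_gb))  # the SSD runs about 40X more per GB
```

Forty times the cost per gigabyte is why nobody should be thinking of SSDs as bulk storage at these prices.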
Now let’s talk about some benchmarks. As I said a month ago, in one benchmark test I have seen from IBM documentation, IBM shows that to process 135,000 IOPS, a collection of SSDs (it looks like five of the units) would burn about 300 watts of juice, while a collection of hard disk drives (something like 400 to 500 drives, perhaps) would consume about 8,300 watts. (Yes, this was probably watt-hours, but no one says that when talking about juice until they want to see how much money power costs.) Clearly, there are some big power and noise savings that come from switching to SSDs.
I have been scratching my head about the two SAP business intelligence benchmarks that IBM ran on Power 550 servers and that I told you about three weeks ago. After getting my hands on some new IBM documents, I now understand what happened. The Power 550 that was able to handle 90,492 query navigation steps per hour on SAP’s BI Mix Load business intelligence benchmark was equipped with 96 of IBM’s 15K RPM SAS drives, while the configuration that was able to process 90,635 query navigation steps per hour was configured with 22 SSDs. (The SAP benchmarks do not talk about anything but processor and main memory in describing the hardware, which is stupid.) That’s a big reduction in the number of disks, which means fewer I/O expansion drawers and cables and other things that cost money. (But it does not mean fewer disk controllers, since each controller can only handle so many IOPS before it gets saturated.)
IBM didn’t do the economics on this, but let’s see how the disks alone price out for this SAP benchmark test. Let me give you a hint: the SSDs get clobbered on price. At $498 a pop, the 15K disk drives run up a bill of $47,808 at list price, while the SSDs cost a whopping $220,000. That is a factor of 4.6 increase in price for a 75 percent reduction in space and maybe a 60 percent reduction in power consumption for the drives. Clearly, this is not worth the money. Which is why I said SSDs should be sprinkled onto systems to boost performance and should not be viewed as a way to replace disk drives. That IBM example had a sustained I/O rate of 616,000 IOPS on the SSD side, which was somewhere around 20 times the rate for the 96 disks. This was a silly configuration. Don’t use this as your example.
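The price math above is easy to sanity-check; the per-unit figures are the list prices quoted earlier in this article.

```python
# Disk-versus-SSD price check for the two SAP BI Mix Load configurations.
hdd_cost = 96 * 498        # 96 15K RPM SAS drives at $498 apiece
ssd_cost = 22 * 10_000     # 22 SSDs at Power 550 pricing
ssd_iops = 22 * 28_000     # aggregate I/O rate of the SSD configuration

print(hdd_cost, ssd_cost)             # 47808 220000
print(round(ssd_cost / hdd_cost, 1))  # the 4.6X price factor
```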
You need to figure out how much of your data is hot (meaning it is being accessed all the time) and how much of it is cold (just sitting there, mostly being ignored by applications). Now you have to ask yourself this: do you want to refresh your disk subsystems with new SAS disks and SSDs, or do you want to sprinkle in some SSDs with your SCSI disks to boost I/O performance and to open up some capacity?
To help customers get their brains wrapped around this, IBM came up with some scenarios. In the first scenario, a customer with a Power 570 has 360 of the 35 GB disks with 12.6 TB of aggregate capacity, which takes up a little more than four racks of space. To get the same aggregate sustained I/O rate, IBM says that this customer can move to 32 of the SSDs (for 2.2 TB of capacity) and 48 of the 282 GB SAS disks (for 13.5 TB of capacity), and cram all of this into 36U of space, less than one rack. That is a 4.5 to 1 compression in physical space required for the disks, and it yields 24.6 percent more disk space, too. What IBM’s comparison did not say is that those 32 SSDs would cost $423,520 at list price, even if they did yield an incredible 896,000 IOPS of sustained I/O. The 48 disk drives, by comparison, cost only $55,200, or about $4.08 per GB and about $3.83 per IOPS. To be fair, the SSDs used in the Power 570 cost $190 per GB, but only 47 cents per IOPS. Just moving to 360 of the 146 GB disks could cost $179,280, or about $3.41 per GB. The hybrid disk/SSD scenario IBM laid out delivers around 910,000 IOPS and 15.7 TB of capacity at a cost of $30.36 per GB and 53 cents per IOPS. (This scenario assumes a ratio of cold to hot data of 6 to 1.)
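The blended figures for that hybrid configuration can be reproduced from the list prices quoted above. One caveat: the roughly 300 IOPS per 15K RPM disk used here is my assumption, back-solved from the article's numbers, not an IBM specification.

```python
# Scenario one: 32 SSDs plus 48 282 GB SAS disks on a Power 570.
ssd_n, disk_n = 32, 48
cost = ssd_n * 13_235 + disk_n * 1_150   # $478,720 all told at list price
gb = ssd_n * 69.7 + disk_n * 282         # roughly 15.7 TB of capacity
iops = ssd_n * 28_000 + disk_n * 300     # roughly 910,400 IOPS
                                         # (300 IOPS/disk is my assumption)
print(round(cost / gb, 2))    # 30.36 dollars per GB
print(round(cost / iops, 2))  # 0.53 dollars per IOPS
```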
This is a pretty tough sell for a lot of customers, unless they are I/O bound. IBM’s scenario two seems more likely to play out in the real world. In this scenario, you figure out how much of your data is hot, and you put that data onto SSDs. You start with the same 4.1 racks of 35 GB disks (360 of them), and you assume that about 20 percent of your data is hot. No shop completely fills up its disks, because filling up disks really slows down transaction processing, and IBM’s scenario two further assumes that only 30 percent of the capacity on the old 35 GB disks actually has data. So while we are talking about 12.6 TB of aggregate raw capacity, we are really only talking about 3.7 TB of actual data on those 360 hard disks. Given this, you have 756 GB of hot data, which will require a dozen of those SSDs (to give you 836 GB of hot data capacity). So you plug in the SSDs and free up 2.5 TB of raw capacity, 756 GB of real data. Those dozen SSDs cost you $158,820, but if you wanted to get 2.5 TB of additional raw capacity, you’d have to buy nine of the 282 GB disks, which would cost $10,350. So the “real” cost of the SSDs is $158,820 minus $10,350, or $148,470. (That’s my economic analysis, not IBM’s. IBM didn’t say jack about money in any of its scenarios.)
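Scenario two's sizing chain and my netting of the avoided disk purchase look like this in a few lines; as noted above, the netting is my analysis, not IBM's.

```python
# Scenario two: size the hot data, then net the SSD cost against
# the 282 GB disks you no longer need to buy (my analysis, not IBM's).
raw_gb = 360 * 35            # 12.6 TB of raw capacity on the old 35 GB disks
data_gb = raw_gb * 0.30      # only 30 percent actually holds data
hot_gb = data_gb * 0.20      # 20 percent of that is hot: 756 GB

ssd_cost = 12 * 13_235       # the dozen SSDs (836 GB of flash capacity)
avoided = 9 * 1_150          # nine 282 GB disks you can now skip buying
net_cost = ssd_cost - avoided

print(hot_gb, net_cost)      # 756.0 148470
```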
This scenario is really useful if you have leased the gear and can swap out older disks without paying penalties, or if you want to avoid buying new disks. The upshot of this scenario is that system throughput should go up significantly and response times should drop, too.
IBM is offering resellers and sales reps a few other examples with rough benchmark data to help customers understand the performance implications of SSDs. As usual, the economics are missing from IBM’s comparisons, so I have added in the money side of the calculations. IBM did not specify the tests that it used on i and AIX boxes, but warned that the tests were designed to stress I/O systems:
Example 1: i 6.1 OLTP/DB workload. Take a system with the new 1.5 GB SAS disk controller (feature number 5904, 5906, or 5908 on Power Systems); this has 1.5 GB of write cache and 1.6 GB of read cache memory on the card and supports 3.5-inch and 2.5-inch SAS and SSD drives that can be used inside of Power Systems and in their I/O drawers. Slap eight 15K RPM disks on the controller. Now compare this to the same controller using SSDs only. The SSD setup does 11 times as many transactions, has 10 times better user response time on transactions, has 19 times better disk response time, and has 11 times the IOPS per device running the test. This all sounds great. But the all-disk subsystem–not including I/O drawers, and assuming it is a Power 550 box–costs $8,500 for the controller and $12,650 for the disks, for a total of $21,650. The SSDs plus the controller costs $167,320. So you get roughly 10 times the performance, but it costs you 7.7 times as much money to get it. (It would probably make more sense to add two SSDs and boost performance by 20 percent or so, right?)
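IBM's documents give only the totals for this example, not the SSD count. Backing the $8,500 controller out of the $167,320 figure implies a dozen SSDs, and at the Power 570 price rather than the Power 550 price; that count is my inference, not IBM's stated configuration.

```python
# Example 1 economics, working back from the quoted totals.
disk_total = 21_650               # controller plus eight 15K disks, as quoted
ssd_total = 8_500 + 12 * 13_235   # controller plus an inferred dozen SSDs
                                  # (the dozen is my back-solved guess)
print(ssd_total)                        # 167320, matching the quoted figure
print(round(ssd_total / disk_total, 1)) # the 7.7X cost multiple
```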
Example 2: AIX 6.1 OLTP/DB workload. Take six of the 1.5 GB SAS controllers and 36 of the 15K RPM disks, which cover the hot data in a Unix box. Replace this with six SAS controllers and 36 SSDs. On this AIX test, you get 42 times more transactions, 3 times better database response time, 3.5 times better disk response time, and 42 times more IOPS per device. But the same economics are going to apply. If you price out 36 of the 300 GB disks and six of the controllers, it costs $92,400, but it costs $527,460 on a Power 570 for the controllers plus the SSDs. Still, 42X the performance. . . . That’s pretty compelling, unless this is a typo in the IBM docs.
Example 3: i 6.1 OLTP/DB workload. Take four 1.5 GB SAS controllers and 144 disk drives and compare it to four controllers plus 72 disks and 16 SSDs. You have 39 percent fewer drives in the box, you do the same number of transactions, you get between 2 and 2.5 times better end user response time, you get 1.75 times better disk response time, and you crank 2.1 times more IOPS through each device. The all-disk setup costs $199,600 using 282 GB disks, while the disk and SSD combination costs $379,560. That’s a 90 percent premium for roughly halving end user response times and cutting the disk drive count by 39 percent. That may be a tough sell in a tough economy. Still, it is a little less expensive than doubling the disk count to 288 drives and the controller count to eight of the 1.5 GB SAS controllers to chase a similar response time improvement. (Not including I/O drawers, that is $399,200.) If response time is an issue, SSDs are the smart move, it seems.
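The example 3 premium checks out against the list prices quoted earlier; the $379,560 hybrid figure is taken from IBM's documents as quoted, since the per-unit breakdown for it is not given.

```python
# Example 3: premium for the disk-plus-SSD mix versus brute-force doubling.
all_disk = 4 * 8_500 + 144 * 1_150   # four controllers, 144 282 GB disks
hybrid = 379_560                     # IBM's disk-plus-SSD figure, as quoted
doubled = 8 * 8_500 + 288 * 1_150    # doubled disks and controllers

print(all_disk, doubled)                # 199600 399200
print(round(hybrid / all_disk - 1, 2))  # 0.9, the 90 percent premium
```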
Example 4: i 6.1 OLTP/DB workload. This comparison replaces SCSI disk drives hanging off 1.5 GB SCSI controllers with SSDs hanging off 1.5 GB SAS controllers. Take three of the 1.5 GB SCSI controllers and 108 SCSI disks. You can replace these with two 1.5 GB SAS controllers and a dozen SSDs. You’ll get 18 percent more transactions done, and do so with 89 percent fewer drives. You’ll have 2.5 times better end user response times on transactions, 4.5 times better disk response time, and push 11 times the IOPS per device. You’ll also shell out $175,820 to do this, compared to zero dollars for the SCSI disks you probably have already written off.
It should be clear by now that you really need to do some testing to figure out the right configuration for your workloads and your budget. But this is a start.