Four disks – RAID 10 or two mirrored pairs?
I have this discussion with developers quite often. The context is an application running on Linux with a medium amount of disk I/O. The servers are HP ProLiant DL3x0 G6 boxes with four 15k rpm disks of equal size, backed by a Smart Array P410 controller with 512MB of battery- or flash-backed write cache. There are two schools of thought here, and I wanted some feedback...

1. I'm of the mind that it makes sense to create one array containing all four disks, set up as RAID 10 (1+0), and partition as necessary. This gives the greatest headroom for growth and has the benefit of leveraging the higher spindle count and better fault tolerance without degradation.

2. The developers think it's better to have multiple RAID 1 pairs: one for the OS and one for the application data, citing that the spindle separation would reduce resource contention. However, this halves the number of drives behind any one volume and thus limits throughput, and in this case the OS doesn't really do much other than regular system logging. Additionally, the battery-backed RAID cache and substantial RAM seem to negate the impact of disk latency...

What are your thoughts?
My thoughts are that performance talks and bullspit walks.
Since you’re discussing recreating the array anyway, why not do it both ways, run a test load against each, and graph the results?
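One way to run that comparison (a sketch only — `fio` is assumed to be installed, and the directory, mix, and queue-depth parameters are illustrative, not tuned for your workload; run it against a scratch filesystem, since the write portion creates real I/O):

```shell
# Random 4k read/write mix, roughly the access pattern of a database or
# mail server. Run the identical job against each candidate array layout
# and compare the reported IOPS and latency percentiles.
fio --name=randrw-test \
    --directory=/mnt/scratch --size=4g \
    --direct=1 --ioengine=libaio \
    --rw=randrw --rwmixread=70 \
    --bs=4k --iodepth=32 --numjobs=4 \
    --runtime=120 --time_based --group_reporting
```

`--direct=1` bypasses the page cache so you measure the array (and the P410's cache behavior) rather than your RAM.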
If, as you say in your comment, real-life experience shows that performance doesn’t depend on the underlying RAID level (and you actually do have the numbers to back that up), then I would recommend going with the level that will continue to give you the greatest performance as you scale, which is almost always going to be RAID 10.
If you’re not going to expand, then it literally doesn’t matter, since you apparently get the same performance from either option, and you’re painting the bikeshed.
Since nobody has presented any real-world data regarding this question, I would like to point to an interesting article on this exact question!
From the article:
“As much as we love raw sequential throughput, it’s almost worthless for most database applications and mail servers. Those applications have to deal with hundreds or even thousands of tiny requests per second that are accessed at random on the hard drives. The drives literally spend more than 95% of the time seeking data rather than actually reading or writing data, which means that even infinite sequential throughput would solve only 5% of the performance problem at best.”
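The seek-bound arithmetic behind that quote is easy to sketch. A back-of-envelope estimate (the ~3.5 ms average seek time is an assumed typical figure for a 15k SAS drive, not from the article; rotational latency is half a revolution) shows why the spindle count dominates random-I/O performance:

```python
# Back-of-envelope random-read IOPS for 15k rpm drives.
rpm = 15000
avg_seek_ms = 3.5                       # assumed typical for a 15k SAS drive
rot_latency_ms = (60000 / rpm) / 2      # half a revolution = 2.0 ms

iops_per_drive = 1000 / (avg_seek_ms + rot_latency_ms)   # ~182 IOPS

# Random reads can be serviced by any member, so they scale with spindles:
raid10_read_iops = 4 * iops_per_drive   # four-disk RAID 10
raid1_read_iops = 2 * iops_per_drive    # a single mirrored pair

print(round(iops_per_drive), round(raid10_read_iops), round(raid1_read_iops))
# → 182 727 364
```

Roughly double the random-read ceiling for the four-disk array — which is exactly the headroom argument for option 1 when the workload is seek-bound rather than throughput-bound.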
Food for thought: if you decide to go with a four-disk RAID 10, then at any point in the future the developers can point the finger at you when a performance issue arises. If you go with two mirrored pairs and a performance issue arises, you might still have the option of converting to a four-disk RAID 10, depending on how complex that conversion would be.