MySQL LOAD DATA INFILE: Better Server, Worse Performance
I am testing out Microsoft Azure Database for MySQL and have run into a performance issue that I do not understand. I launched a "Basic" server with 1 vCore (2 GB RAM, "Standard Storage"), which is their lowest possible tier of server. I created a database, a table, and imported about 4 million rows (30 GB) with LOAD DATA INFILE. It took 56 minutes. Next, I launched a "Memory Optimized" server with 8 vCores (80 GB RAM, "Premium Storage"). I repeated the exact same tasks and loaded the exact same file. This time it took 7 hours and 16 minutes. Better server, much worse performance (on this task) -- not what I was expecting. To be certain I had not made a mistake, I repeated the steps above, but I got almost the exact same results again. I suspect the issue is that the Memory Optimized server has different default server parameters than the Basic server which make this task perform more slowly (I haven't changed the parameters from the defaults that Azure sets). But I am not sure which parameters are the culprit. If anyone has insight into this issue, I'd appreciate it. Basic server parameters: http://pastebin.zone/wRniyPm6 Memory Optimized server parameters: http://pastebin.zone/phuDcZj4
Here’s what seems to have been causing this behavior:
Per the Azure documentation, the Basic tier server on Azure comes with “variable” IOPS whereas the Memory Optimized server comes with a fixed IOPS which is based on the amount of storage assigned to the database server.
I had 100GB assigned to the Memory Optimized server. This resulted in it having 300 IOPS, in accordance with Azure’s 3 IOPS / GB ratio.
Presumably the “variable” IOPS on the Basic server ended up being significantly more than the 300 IOPS that the Memory Optimized server had.
Lesson learned: to get fast storage access on Azure Database, you need to assign plenty of storage capacity to your server (even if you don’t need that much storage!).
Suggestion for your AWS Paramenters Group when you are LOADing millions of rows of data,
innodb_change_buffer_max_size=50 # from 25 for improved LOAD speed during high volume process
when done, back to 25% (or less) depending on your need for typical operation.
On your Memory Enhanced instance,
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function
For next test, these should reduce elapsed time.