Lessons learned optimizing MySQL

This week I’ve been quite entertained at work optimizing a big MySQL database, 18M rows and about 10Gb of table data for a high traffic web site (11M unique views per month). Not optimizing for performance is overkill in this case, so I have been learning and enjoying a lot during these days.

The first thing to note is that database servers are complex beasts (don’t underestimate them) and each and every database provider has its own strengths but also its own caveats.

MySQL logo

Sooner or later you’ll face a behavior that may seem counter-intuitive and there’s when you must read a lot of documentation. In this case, MySQL Performance Blog and MySQL documentation have been of great help, as well as the now unmantained-but-still-useful Hack MySQL site, which provides great tips very well explained.

Second. As always, doing an 80% optimization consumes 20% of the time and most of it is quite straightforward. EXPLAIN, review results, create index, EXPLAIN again. This way you can do a good job just knowing some indexing basics. Your indexes may be a bit fat, but will do the job.

Then you’ll need to be patient, creating indexes is not always fast, in fact the number, size and heterogeneity of the columns involved will increase the time it takes to create them. While you wait, the best is to read good books about MySQL or relational databases like: Relational Database Index Design and the Optimizers.

You’ll need to be organized. Creating indexes randomly just looking a couple of slow SELECTs is something, but you must look at the big picture: make an inventory of all the application queries, no matter how unimportant look like because if they attack a big table with a WHERE or an ORDER BY they will perform bad. Fact. This list will help you organize the indexes and see which ones are used and which not and where.

Learn as well what the configuration parameters mean, MySQL has a bunch of them. Just increasing numbers will not always solve your bottlenecks. If you see something strange, there might be a parameter misconfigured and causing bad results.

Sometimes you’ll find yourself frustrated by something that you can’t understand. I found that almost anything in this field has an explanation, even a reasonable one.

Finally, don’t overlook other thinks. Database performance is important but there may be a lot of room to improve in your code, don’t blame MySQL as the root of all problems!

Related Posts

Bye bye Kurobox

A deserved farewell to a device that has worked flawlessly for so many years

DIY Arduino Christmas tree lights

A detailed explanation of my home made light switcher made with Arduino and a relay shield

Back to blogging

My statement on getting back to blogging. Will this go anywhere?

Simple Mini Profiler Glimpse plugin

I just created my first Glimpse plugin, integration between Mininprofiler and Glimpse.

Carbon Copy Cloner saved my day

Just a gratitude post because this piece of software was extremely useful when others don't. And it was my birthday.

Troubleshooting MSDTC, RPC and NServiceBus issues

MSDTC is rare and can cause many headaches. This is the process I followed to track down the problems I was having

Using fail2ban with nginx in Debian

fail2ban helps you fight spam and bots but comes with an Apache sample. I converted it to handle Nginx information.

How msiinv saved my day

mssinv is a tool to manage MSI installed packages and I used it to track a problem with a package partially installed, that had to be removed with this tool

Install ASP.NET MVC 3 Manually

Due to some strange problems in my PC, I had to install this package manually and this is applicable to many other installers

Deployment web site Zip with MsBuild and TeamCity

I needed to make a Zip file of a web site and automate it with TeamCity. I used MsBuild and some obscure tricks