Friday, May 13, 2011

Taking a look at MySQL Cluster -- and saving money in the long run

Over the last few years I have worked for few different companies that utilized versions of relayed replication with the MySQL databases. Their data handler connections where done with Java connection pooling or custom built PHP/PERL code. This allowed each company to spread the load of heavy queries and maintain a decent up time based on the load of their servers. One of the companies comes to mind often when I think of the MySQL cluster.

The company was growing and we made plans to move out of leased servers and build out our racks in a co-location center. This would have been a great time to think ahead and move towards or at the very least heavily evaluate a MySQL Cluster solution. I often got push back on this topic. Granted hindsight is always 20/20. Everything was new and fresh and we could have tested and benchmarked before live traffic was on the systems. Instead we continued with a basic MySQL replication set up (1 master --> 2 slaves). It did not take us long before we adjusted the databases to a relayed replication environment (1 master --> slave --> slave) . As we advanced with this environment we built out custom PHP handles that could take thresholds per query and allowed us to pick and choose which database the query would run on based on server load and slave lag. This allowed us to keep heavy queries off the master easily. While the relayed replication setup improved our stability and reduced our downtime drastically, downtime was still a concern and issue at times.

Looking back I think the push back against a MySQL cluster was more related to costs of the additional servers versus the primarily three servers we ran on. Was that really the best choice? If we would have invested ~20k to ~40k more in additional beefy database servers ( ~20k per box at the time) we potentially could have saved development time & costs,and had a stronger and more stable database. Saving these two items alone could have helped us advance the platform more, since we had more development cycles and been more profitable due to almost no downtime. MySQL cluster is focus is 99.999% availability. I am willing to bet that the savings gained from going to a cluster then would have been more than the ~20k to 40k we saved.

To make looking back even harder, the minimum high availability MySQL cluster set up is three servers, exactly what we had anyway. I would have liked to make it more robust and have at least one other server but even that was not required. If we would have taken a step back, and evaluated our options and not based our opinion on assumptions, I feel we would have avoided the issues we ended up developing around.

I had a great time pushing with all of the development we did at the time. But was it best for the company?

Does that mean MySQL cluster is the answer for everything, no. Does it mean it is well worth taking a break and stepping back to truly evaluate all of your options, big or small company, yes.

Numerous examples of setting up a cluster are available online, it is worth a look.