Friday, June 3, 2011

Smart Selects for replicated systems


Over the last few years I have worked with a few different companies and their replicated databases. Which replication setup you choose depends on your needs. I have enjoyed relayed (chained) replication, because it gives you the chance to make your applications even smarter.

One reason is that if you do lose the master, a slave can become the new master and you still have replication active while you rebuild the downed server.

Another reason is the options it gives you for selects. I have seen systems with a replicated slave that is never touched. Granted, that is safe, but it is money spent on hardware while users complain about application speed. Whichever scripting language you prefer, you can use all of your replicated databases to your advantage. To be safe, all writes should go to the master unless you can control and confirm that nothing from the master would try to write to the same location. Selects, however, can be spread across your servers.

Users' complaints are often about speed: everything must be faster. If your database handle only connects to one database, then you are limited to the performance of that one database. Java has connection pooling (http://java.sun.com/developer/onlineTraining/Programming/JDCBook/conpool.html), and PHP can be used to do the same task. Personally, although I am biased, I feel you will have more control with PHP.
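
As a minimal sketch (the hostnames and credentials here are placeholders, and I am assuming the mysqli extension), the PHP side of this is simply holding one handle per server instead of a single connection, so writes stay on the master while reads can be routed elsewhere:

<?php
// Minimal sketch with placeholder hosts/credentials: one handle per server
// instead of a single connection, so reads can be routed away from the master.
$master = new mysqli('master.db.example.com', 'app', 'secret', 'appdb');
$slave  = new mysqli('slave1.db.example.com', 'app', 'secret', 'appdb');

// Writes always go to the master.
$master->query("UPDATE accounts SET last_login = NOW() WHERE id = 42");

// Reads can use the slave when it is safe to do so (see the steps below).
$result = $slave->query("SELECT name, email FROM accounts WHERE id = 42");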

The concept is simple. Of course you do not want to hurt performance with unnecessary overhead, but these steps are, and should remain, very fast:
  1. Connect to the slave.
  2. Execute SHOW SLAVE STATUS.
  3. Check 'Seconds_Behind_Master'.
  4. Based on the threshold you decide on, choose which database you are going to read from (a sketch follows this list).
    1. You can pass the threshold to your connection handle so you can have a threshold per query if needed.
    2. You can also create smart connections that never touch a slave if the query does not start with SELECT; the options are up to you.
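
Here is a sketch of those four steps, assuming the mysqli extension, placeholder hosts and credentials, and a made-up threshold parameter; the account used needs the REPLICATION CLIENT privilege to run SHOW SLAVE STATUS:

<?php
// Sketch of the four steps above; hosts, credentials, and $maxLag are assumptions.
$master = new mysqli('master.db.example.com', 'app', 'secret', 'appdb');
$slave  = new mysqli('slave1.db.example.com', 'app', 'secret', 'appdb');

function pickReadHandle(mysqli $master, mysqli $slave, $maxLag = 5)
{
    // Steps 1-2: connect to the slave and ask for its replication status.
    $res    = $slave->query('SHOW SLAVE STATUS');
    $status = $res ? $res->fetch_assoc() : null;

    // Step 3: Seconds_Behind_Master is NULL when replication is broken or stopped.
    $lag = ($status !== null) ? $status['Seconds_Behind_Master'] : null;

    // Step 4: read from the slave only if it is within the threshold,
    // otherwise fall back to the master.
    if ($lag !== null && (int)$lag <= $maxLag) {
        return $slave;
    }
    return $master;
}

// Example: allow a heavy report query to tolerate more lag than a normal read.
$db   = pickReadHandle($master, $slave, 30);
$rows = $db->query('SELECT COUNT(*) AS c FROM orders');

Passing the threshold on each call is what gives you a threshold per query, and a wrapper like this is also the natural place to force anything that is not a SELECT onto the master.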

This allows you to keep slower, data-intensive queries off the master server. How you choose your thresholds depends on your application, as does how you define what may select from the master versus the slaves only. This is not a replacement for query optimization, just a way of expanding your options to get the data you are after.

We used three databases in a relayed replication InnoDB environment. We were able to check the third database first, move up to the second database if the thresholds called for it, and lastly run the select on the master if required. Typically Seconds_Behind_Master was 0, or under 5 seconds, on this TB+ database. If something major was going on, the third database would fall further behind and the selects would roll to the second database naturally.
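
A rough sketch of that cascade, again with placeholder hosts and credentials for the master, second (relay), and third databases, reusing the same Seconds_Behind_Master check:

<?php
// Rough sketch of the cascading check described above; hosts/credentials are placeholders.
$master = new mysqli('db1.example.com', 'app', 'secret', 'appdb');
$second = new mysqli('db2.example.com', 'app', 'secret', 'appdb');
$third  = new mysqli('db3.example.com', 'app', 'secret', 'appdb');

function secondsBehindMaster(mysqli $db)
{
    $res = $db->query('SHOW SLAVE STATUS');
    $row = $res ? $res->fetch_assoc() : null;
    return ($row !== null) ? $row['Seconds_Behind_Master'] : null;
}

function pickReadServer(mysqli $master, mysqli $second, mysqli $third, $maxLag = 5)
{
    // Check the end of the chain first and roll up toward the master as needed.
    foreach (array($third, $second) as $slave) {
        $lag = secondsBehindMaster($slave);
        if ($lag !== null && (int)$lag <= $maxLag) {
            return $slave;
        }
    }
    return $master; // last resort: run the select on the master itself
}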

This worked very well for our application at the time, and I have used the concept with more than one company. Everything is, of course, dependent on your application, but getting valid results to users quickly is a common issue we all face.