Wednesday, April 13, 2011

MySQL Relayed Replication Solution

Once I had a situation where MySQL cluster was not an option.  I still wanted replication and redundancy. 
What I built was a relayed replication solution. 

I had a MYSQL DB running the INNODB engine that contained over a TB of data. 

MySQL master server ( SERVER_A ) that replicated to the slave ( SERVER_B ).
SERVER_B was also a slave that then replicated to a second slave (SERVER_C).
 I would have preferred to add a server_d but budget did not allow. 

What did this do for us?  Well this structure allowed us few different options.
First we had redundancy hardware and with replication. If SERVER_A crashed we could update the masterDB reference, which was just an internal dns reference, to SERVER_B and the site was up and running with little downtime. We could then do some research on SERVER_A to find what the problem was. Once this box (server_a) was ready for rotation again we would make it a slave to SERVER_C since SERVER_B was the new master.  
This round robin of servers also allowed us to add database updates in a manor that updated the slaves first and near zero downtime for an update to the master. We just rotated the order of the databases. 
Another benefits of this hardware setup was we where able to build out smart database handles. So based on your query and related settings we could decide which database would handle the select. All writes went to the master of course.  This allowed the big queries to stay out of the way of other selects and resulted in few if any deadlocks or slow responses. 

So for example: 

Then after a database rotation:

If you are running a Master with two to three slaves off it I would ask why? Once you loose the master all of your slaves are also down. With relayed replication you can keep replication in case of a master crash. The worst case scenario is when the slave in the middle crashes. The best solution for this would be a forth box so you can skip down further and still have replication up.

With optimized queries you can keep the seconds behind master at 0 and diversify your select load to enable a robust footprint that has built in replication redundancy.

The Data Charmer also talks about  Relayed replication