Jump to: navigation, search

Difference between revisions of "Reddwarf-MySQL-Replication-and-Clustering"

(Galera Cluster)
(Replaced content with "Documentation how replication and clustering in Reddwarf works should go here.")
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Replication =
+
Documentation how replication and clustering in Reddwarf works should go here.
 
 
== MySQL Master/Slave Replication ==
 
 
 
'''How master-slave replication works:'''
 
* Replication in mysql works basically via log shipping. 
 
* The mysql master logs all data manipulation events in the mysql binary log. 
 
* While this is happening the slave instances are connected to the master via TCP and receive the same events and write them to their relay log. 
 
* Binary logging should be turned *on* on the master and *off* on the slaves. 
 
* If binary logging is turned on on the slave you will suffer massive IO hits as each log entry writes to the relay log, then the table, then the binary log.
 
* there are 2 internal mysql threads that exist on the slave, the SQL thread and the IO thread
 
* the IO thread is responsible for getting binary events from the master and the SQL thread is responsible for applying them locally.
 
* slaves should be read-only, allowing data manipulation statements on slaves will result in corrupt or out of sync slaves
 
 
 
 
 
'''How to know that replication broke:'''
 
* parse the output of "SHOW SLAVE STATUS" on the replicated slave.
 
* the IO thread is set to No
 
* the SQL thread is set to No
 
* the seconds_behind_master is > a set number and isn't reducing over time (this is a conditional case that we need to define). 
 
* This case is a failure because it means that for whatever reason the slave will never be in sync with the master.
 
* This can be due to the slave not having enough resources to keep up or that the master has too many slaves.
 
* If ever any of these 3 things are true it is assumed that replication is broken and the slave is to be destroyed and recreated.
 
 
 
 
 
'''Steps to setup replicaiton:'''<br />
 
* No point in rewriting known steps, follow instructions here: [http://dev.mysql.com/doc/refman/5.1/en/replication-howto.html MySQL Replication Documentation]
 
 
 
 
 
'''Pros/Cons'''
 
{| class="wikitable"
 
|-
 
! Pros !! Cons
 
|-
 
| Standard MySQL || Prior to MySQL 5.6 Replication is brittle (See notes on MyISAM)
 
|-
 
| Very well understood || Repairing can be time consuming
 
|-
 
| No special architecture needed || Creating new slaves can require downtime or read locking on the master
 
|-
 
| Easy to troubleshoot  || Only works well for scaling reads
 
|-
 
| Works with all storage engines (see notes on MyISAM) || Single threaded, high concurrency on the master can cause issues where slaves can't keep up
 
|-
 
| Offers simple and complex configurations for read scaling ||
 
|-
 
| Opensource tools available for monitoring/troubleshooting ||
 
|-
 
| Great for making backup or reporting slaves ||
 
|-
 
| Good general purpose replication ||
 
|}
 
 
 
 
 
== MySQL Master/Master Replication ==
 
 
 
''' How Master/Master or Multi-master Replication Works '''
 
* Uses the same mechanism used for Master/Slave replication (Log Shipping) but with each slave also being a master to the other node
 
* Requires the use of an auto_increment offset and and increment setting to allow writes to multiple replicas
 
* Requires that all tables have an auto incremented integer primary key
 
* You can use more than 2 masters but then replication must be configured in a circular fashion as MySQL doesn't support multiple replication sources
 
 
 
 
 
'''How to know that replication broke:'''
 
* Same steps as standard MySQL Master/Slave except you must check each master because each master is also a slave of another master but checking status is done via the same method.
 
 
 
 
 
'''Steps to setup replicaiton:'''
 
* No point in rewriting known steps, follow instructions here: [http://www.onlamp.com/2006/04/20/advanced-mysql-replication.html Multi-master Howto]
 
 
 
 
 
'''Pros/Cons'''
 
{| class="wikitable"
 
|-
 
! Pros !! Cons
 
|-
 
| Standard MySQL || Prior to MySQL 5.6 Replication is brittle (See notes on MyISAM)
 
|-
 
| Very well understood || Repairing can be time consuming
 
|-
 
| Works with all storage engines (see notes on MyISAM) || Creating new slaves can require downtime or read locking on the master, even more complex with circular replication
 
|-
 
| Opensource tools available for monitoring/troubleshooting  || Difficult to troubleshoot and even more difficult to repair.
 
|-
 
|  || Single threaded, high concurrency on a master can cause issues where the slave can't keep up
 
|}
 
 
 
 
 
 
 
== Tungsten Replicator ==
 
 
 
''' How Tungsten Replicator Works '''
 
 
 
''' How to know replication is broken '''
 
 
 
''' Pros/Cons '''
 
 
 
 
 
= Clustering =
 
 
 
''' Clustering is NOT replication and replication is NOT clustering '''
 
I feel like it is very important to understand that statement.  Clustering *may* use replication to place copies of data on all nodes of the cluster but replication is not *required* to make a cluster of a service.  Clustering is used to provide one or more of the following:
 
Fault Tolerance
 
High Availability
 
Load Balancing
 
 
 
Replication in general is a complex topic and I won't try to explain it here but understand that when clustering mysql you must have *AT A MINIMUM* 3 cluster nodes.  This prevents a split brain situation from happening by allowing the resolution process to achieve quorum. 
 
 
 
With that understood, note that the inclusion of clustering technologies on this page is because clustering in mysql usually replicates some or all of the data in multiple places and therefore may be considered by the uninformed as a form of replication.
 
 
 
 
 
== Galera Cluster ==
 
Galera cluster is quickly becoming the defacto standard for clustering in MySQL.  MySQL clustering technologies all implement themselves by using the plugable storage engine infrastructure of MySQL.  Galera is the only clustering solution that uses the very commonly used InnoDB storage engine, albeit a patched version.  Galera is the underlying technology in Percona Cluster server and MariaDB Cluster server.
 
''' How Galera Cluster Works '''
 
 
 
''' How to know clustering is broken '''
 
 
 
''' Pros/Cons '''
 
 
 
 
 
== MySQL NDB Cluster ==
 
 
 
''' How NDB cluster works '''
 
 
 
''' How to know clustering is broken '''
 
 
 
''' Pros/Cons '''
 
 
 
''' Notes: '''
 
* '''MyISAM:''' The MyISAM engine is not transaction safe and therefore not crash safe.  There is a common misconception that MyISAM breaks replication and this is simply not true.  A crashed host using MyISAM will cause replication to fail, but the failure lies in the fact that the host crashed not that MyISAM is used.  Where MyISAM makes replication recovery difficult is that crash recovery is a manual process and therefore leaves replication in a known bad state until recovered.  The important thing to note here is that simply creating a table using the MyISAM storage engine on a replicated host does *not* immediately break replication.  While Oracle MySQL hasn't done much to improve MyISAM's crash recovery and fault tolerance the folks at MariaDB are pushing toward that.
 

Latest revision as of 16:16, 2 May 2013

Documentation how replication and clustering in Reddwarf works should go here.