Reddwarf-MySQL-Replication-and-Clustering-Concepts

Replication

MySQL Master/Slave Replication

How master-slave replication works:

  • Replication in MySQL works, at its core, via log shipping.
  • The MySQL master logs all data manipulation events in its binary log.
  • While this is happening, the slave instances are connected to the master via TCP; they receive the same events and write them to their relay log.
  • Binary logging should be turned *on* on the master and *off* on the slaves.
  • If binary logging is turned on on a slave you will suffer a massive IO hit, as each event is written to the relay log, then the table, then the binary log.
  • There are 2 internal MySQL threads that exist on the slave: the SQL thread and the IO thread.
  • The IO thread is responsible for getting binary log events from the master; the SQL thread is responsible for applying them locally.
  • Slaves should be read-only; allowing data manipulation statements on slaves will result in corrupt or out-of-sync slaves. (A quick way to verify these settings follows this list.)
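
As a sanity check of those last two points, the on/off settings can be read straight from each server. A minimal sketch, assuming the mysql-connector-python driver; host names and credentials are hypothetical placeholders:

import mysql.connector

# Verify binary logging is *on* on the master and *off* on the slaves,
# and that slaves are read-only. Hosts and credentials are placeholders.
def check_role(host, expect_binlog, expect_read_only):
    conn = mysql.connector.connect(host=host, user="admin", password="secret")
    cur = conn.cursor()
    cur.execute("SELECT @@log_bin, @@read_only")
    log_bin, read_only = cur.fetchone()
    conn.close()
    assert bool(log_bin) == expect_binlog, "%s: unexpected log_bin=%s" % (host, log_bin)
    assert bool(read_only) == expect_read_only, "%s: unexpected read_only=%s" % (host, read_only)

check_role("master.db", expect_binlog=True, expect_read_only=False)
check_role("slave1.db", expect_binlog=False, expect_read_only=True)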


How to know that replication broke:

  • Parse the output of "SHOW SLAVE STATUS" on the replicated slave and check whether:
  • the Slave_IO_Running column is "No",
  • the Slave_SQL_Running column is "No", or
  • Seconds_Behind_Master is greater than a set threshold and isn't decreasing over time (a conditional case that we still need to define).
  • The last case is a failure because it means that, for whatever reason, the slave will never be in sync with the master.
  • This can be due to the slave not having enough resources to keep up, or the master having too many slaves.
  • If any of these 3 things is ever true, it is assumed that replication is broken and the slave is to be destroyed and recreated; a monitoring sketch follows this list.
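
A minimal monitoring sketch along those lines, again assuming mysql-connector-python with placeholder connection details; the lag threshold and the previous-lag bookkeeping for the "not decreasing" rule are assumptions we would still need to define:

import mysql.connector

LAG_THRESHOLD = 600  # seconds; placeholder, the real threshold is undefined

def replication_broken(host, previous_lag=None):
    conn = mysql.connector.connect(host=host, user="admin", password="secret")
    cur = conn.cursor(dictionary=True)
    cur.execute("SHOW SLAVE STATUS")
    status = cur.fetchone()
    conn.close()
    if status is None:
        return True  # not configured as a slave at all
    if status["Slave_IO_Running"] != "Yes":
        return True  # IO thread is down
    if status["Slave_SQL_Running"] != "Yes":
        return True  # SQL thread is down
    lag = status["Seconds_Behind_Master"]
    if lag is None:
        return True  # NULL lag means replication isn't running
    # Lag is over threshold and not shrinking relative to the last poll.
    if lag > LAG_THRESHOLD and previous_lag is not None and lag >= previous_lag:
        return True
    return False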


Steps to setup replication:

  • No point in rewriting known steps, follow instructions here:
MySQL Replication Documentation


Pros:
  • Standard MySQL
  • Very well understood
  • No special architecture needed
  • Easy to troubleshoot
  • Works with all storage engines (see notes on MyISAM)
  • Offers simple and complex configurations for read scaling
  • Opensource tools available for monitoring/troubleshooting
  • Great for making backup or reporting slaves
  • Good general purpose replication

Cons:
  • Prior to MySQL 5.6 replication is brittle (see notes on MyISAM)
  • Repairing can be time consuming
  • Creating new slaves can require downtime or read locking on the master
  • Only works well for scaling reads
  • Single threaded; high concurrency on the master can cause issues where slaves can't keep up


MySQL Master/Master Replication

How Master/Master or Multi-master Replication Works

  • Uses the same mechanism as Master/Slave replication (log shipping), but with each slave also being a master to the other node.
  • Requires the use of the auto_increment_offset and auto_increment_increment settings to allow writes to multiple replicas without key collisions; a sketch follows this list.
  • Requires that all tables have an auto-incremented integer primary key.
  • You can use more than 2 masters, but then replication must be configured in a circular fashion, as MySQL doesn't support multiple replication sources.
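
For example, with two masters the convention is to give both the same auto_increment_increment and each a distinct auto_increment_offset so that generated keys interleave (1, 3, 5, ... on one master; 2, 4, 6, ... on the other) instead of colliding. A sketch with hypothetical hosts and credentials:

import mysql.connector

# Offset per master; the increment equals the number of masters.
offsets = {"master1.db": 1, "master2.db": 2}

for host, offset in offsets.items():
    conn = mysql.connector.connect(host=host, user="admin", password="secret")
    cur = conn.cursor()
    cur.execute("SET GLOBAL auto_increment_increment = 2")
    cur.execute("SET GLOBAL auto_increment_offset = %d" % offset)  # offset is a trusted int
    conn.close()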


How to know that replication broke:

  • Same steps as standard MySQL Master/Slave replication, except you must check every master, because each master is also a slave of another master; checking status is done via the same method.


Steps to setup replication:

  • No point in rewriting known steps, follow instructions here:
Multi-master Howto


Pros:
  • Standard MySQL
  • Very well understood
  • Works with all storage engines (see notes on MyISAM)
  • Opensource tools available for monitoring/troubleshooting

Cons:
  • Prior to MySQL 5.6 replication is brittle (see notes on MyISAM)
  • Repairing can be time consuming
  • Creating new slaves can require downtime or read locking on the master; even more complex with circular replication
  • Difficult to troubleshoot and even more difficult to repair
  • Single threaded; high concurrency on a master can cause issues where the slave can't keep up


Tungsten Replicator

Recently fully open-sourced, the Tungsten Replicator offers more fine-grained control over manipulating MySQL replication. The configuration process is much different, but the principles are the same. Tungsten Replicator, from now on referred to as TR, offers inline filters for binary log events and also allows for linking of multiple replication "clusters". While TR provides no *real* clustering features, it is used as the basis for a commercial product that provides them. When reading the TR documentation the word cluster is used very frequently and is somewhat misleading. TR is *NOT* clustering; it is replication only.


How Tungsten Replicator Works

  • Master/Slave or Master/Master replication with the ability to link multiple replication groups.
  • TR either runs a process near the replication master that makes binary log events available over a TCP port, or the TR process runs on the master itself and reads the binary logs from the file system.
  • It is worth noting that you should NEVER run TR on the master without configuring it to read the binary logs directly from the file system; otherwise you will have a serious impact on IO.
  • The binary logs are relayed to the main TR process, which controls where the events go via configuration and filters.
  • When logs are shipped to a slave they are cached in a relay log, played on the slave, and then deleted from the relay.
  • This is almost exactly how MySQL replication works, except for the addition of filtering and the fact that native MySQL cannot run a separate replication coordinator as TR does.


How to know replication is broken

  • TR provides an admin command set that lets you check the replication status on all slaves.
  • This centralized approach *could* make administering replication easier for each customer.
  • The concepts are very much like MySQL replication: there are metrics for the log-shipping thread, the applier thread, and how many seconds each slave is behind its master (a wrapper sketch follows this list).
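
Since the only native tooling is Java (see the cons below), status checks would likely be wrapped from Python. A rough sketch that shells out to the trepctl admin tool and parses its output; the exact "key : value" output format assumed here should be verified against the TR docs:

import subprocess

def tr_status():
    # Run the TR admin tool; assumes trepctl is on PATH and that
    # "trepctl status" prints one "key : value" pair per line.
    out = subprocess.run(["trepctl", "status"],
                         capture_output=True, text=True, check=True)
    status = {}
    for line in out.stdout.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
    return status

s = tr_status()
print(s.get("state"), s.get("appliedLatency"))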


Steps to setup replication

Tungsten Replicator Documentation


Pros:
  • Is a much more elegant way of implementing a log relay
  • In-depth configuration options for filtering and event destination
  • Allows for some *hot* configuration changes
  • Supports Master/Master and Master/Slave with no special schema requirements
  • Backed by a commercial product

Cons:
  • It's Java
  • The required JVM memory size makes it uninstallable on 512MB and 1GB instances
  • Would almost *require* a separate "appliance" for the replicator process
  • No libraries for interaction other than Java, so trepctl commands would need to be python wrapped
  • Did I mention it's JAVA????

Clustering

Clustering is NOT replication and replication is NOT clustering; I feel it is very important to understand that statement. Clustering *may* use replication to place copies of data on all nodes of the cluster, but replication is not *required* to make a cluster of a service. Clustering is used to provide one or more of the following:

  • Fault Tolerance
  • High Availability
  • Load Balancing

Clustering in general is a complex topic, so I won't try to explain it here, but understand that when clustering MySQL you must have *AT A MINIMUM* 3 cluster nodes. This prevents a split-brain situation by allowing the resolution process to achieve quorum: with 2 nodes a network partition leaves each side holding exactly half the votes, so neither can claim a majority, while with 3 nodes one side always holds 2 of the 3 votes.

With that understood, note that clustering technologies are included on this page because clustering in MySQL usually replicates some or all of the data to multiple places, and may therefore be mistaken for replication by the uninformed.


Galera Cluster

Galera Cluster is quickly becoming the de facto standard for clustering in MySQL. MySQL clustering technologies are all implemented via MySQL's pluggable storage engine infrastructure. Galera is the only clustering solution that uses the very commonly used InnoDB storage engine, albeit a patched version. Galera is the underlying technology in Percona XtraDB Cluster and MariaDB Galera Cluster.


How Galera Cluster Works

  • Uses Write-set Replication (wsrep) to provide replication hooks into the underlying storage engine; for MySQL this happens at the InnoDB layer.
  • Uses the Galera library to replicate the data.
  • Requires 3 nodes at minimum.
  • 3 nodes provide roughly 2x the performance of a single node; there is definitely not a 1-for-1 performance gain with each node added.
  • Supports true multi-master synchronous replication.
  • Allows for multi-DC fault tolerance (by configuration).
  • Scales reads very well.
  • Write scaling is described as "pretty good".
  • Will need a reliable and protected backend network for inter-node communications.


How to know clustering is broken

  • All monitoring should be done by polling the wsrep_local_state status variable on each node (a polling sketch follows this list).
  • Nodes that have failed will automatically be removed from the cluster.
  • Replacement nodes will be placed back into the cluster when configured to attach to it.
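
A minimal polling sketch, assuming mysql-connector-python and placeholder connection details; in Galera a wsrep_local_state value of 4 means "Synced":

import mysql.connector

def node_synced(host):
    conn = mysql.connector.connect(host=host, user="admin", password="secret")
    cur = conn.cursor()
    cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_local_state'")
    row = cur.fetchone()  # e.g. ('wsrep_local_state', '4') when Synced
    conn.close()
    return row is not None and row[1] == "4"

for host in ("node1.db", "node2.db", "node3.db"):
    print(host, "OK" if node_synced(host) else "NOT SYNCED")
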
Pros:
  • Synchronous replication
  • Active-active multi-master topology
  • Read and write to any cluster node
  • Automatic membership control, failed nodes drop from the cluster
  • Automatic node joining
  • True parallel replication, on row level
  • Direct client connections, native MySQL look & feel
  • Seems to be adopted by the industry

Cons:
  • InnoDB only
  • As the number of write masters increases so does the rollback rate. This can introduce significant latency.
  • Suffers from weakest-link syndrome. The cluster will tend to perform as poorly as its slowest node, so it is important to make sure all nodes are of equal size.
  • More processes to manage and monitor
  • While this seems to be an emerging industry standard, finding people who are subject matter experts is rather difficult.

MySQL NDB Cluster

With Galera outperforming NDB in almost every aspect by 200 to 400%, there's no reason to look at this right now unless there is a specific need for its feature set.


Steps to setup clustering

MySQL NDB Cluster Setup Docs


Notes

  • MyISAM: The MyISAM engine is not transaction safe and therefore not crash safe. There is a common misconception that MyISAM breaks replication, and this is simply not true. A crashed host using MyISAM will cause replication to fail, but the failure lies in the fact that the host crashed, not that MyISAM is used. Where MyISAM makes replication recovery difficult is that crash recovery is a manual process and therefore leaves replication in a known bad state until recovered. The important thing to note here is that simply creating a table using the MyISAM storage engine on a replicated host does *not* immediately break replication. While Oracle MySQL hasn't done much to improve MyISAM's crash recovery and fault tolerance, the folks at MariaDB are pushing toward that.
  • InnoDB: The InnoDB storage engine is the industry-standard default engine used for database operations because it is ACID compliant. The engine isn't as performant as other storage engines because of the overhead involved in being ACID compliant, so it is not ideal for highly concurrent insert operations. Where it shines is its locking mechanism, because it only locks rows that the current query needs to modify. Be aware that some configurations of InnoDB, and of InnoDB with Galera cluster, can result in data loss.
  • Galera-Enabled Packages: One interesting point I brought up to both the Percona and MariaDB folks: I know there are separate packages for Galera-enabled servers, but in the spirit of "simple is better", is it possible to run Galera-enabled server packages standalone, so that *if* in the future a user wanted to enable clustering they could? The short answer is yes, you can, with two drawbacks. First, Galera-enabled servers go through the same regression testing standard server packages do, but most people don't run them standalone, so it is possible to find standalone-related bugs that others haven't seen yet. Second, the release cycle for bug fixes is a bit slower than for standard packages, because standard server bug fixes have to be coordinated with Codership's bug release cycle, so there is a bit of inherent lag.


Discussion Points

Replication is proposed to be a base concept that applies to the instance and is separate from /clusters. You can create an instance as usual in the current model and that instance will be a Master (Read/Write). You can create additional instances and make each a replica of the Master. The replica could be a slave (in the case of Master/Slave), or a Master (in the case of Master/Master). In the case of Master/Slave, the slave must have the ability to become the Master, either if the Master dies (would be done automagically) or if the user decides to manually promote a slave to a master. The promotion of a slave to a master could be done via actions on the instance; it would be an action that only applies to an instance that is a slave. We would need a way to identify an instance as a master or a slave, and a way to reference the master from the slave in the instance details. In the case of Master/Slave, we would not allow a user to chain slaves off of slaves (we may want to consider the API allowing this, but it is something we choose not to support as an operator). Masters/slaves need to be able to exist across AZs.

Things to think about:

  • Is it worth doing master/slave replication, or would jumping right into multi-master clusters be the right thing to do?
  • Future thinking... a cluster is really its own *thing*; it may have its own IP address and DNS name (think of the load balancer concept), and a cluster should have child objects that are instances. Should we allow instance actions, like resizing, on the individual instances, knowing that doing so has the potential to break or degrade performance?
  • Master/slave has many uses beyond read scaling, e.g. a backup host, a reporting host, etc.

Ideas for API

Proposed addition to instance object: Adding an "attributes" or some other cleverly named field that allows us to store dynamic information not *just* related to replication or clustering.

GET /instances/{instance_id}
{
    "instance": {
        "created": "2013-05-08T22:43:34", 
        "flavor": {
            "id": "1", 
            "links": [
                {
                    "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/flavors/1", 
                    "rel": "self"
                }, 
                {
                    "href": "https://ord.databases.api.rackspacecloud.com/flavors/1", 
                    "rel": "bookmark"
                }
            ]
        }, 
        "hostame": "f7e68ccf2da64bb8a75bafe0dd3a0080.rackspaceclouddb.com",
        "id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080", 
        "links": [
            {
                "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/instances/f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080", 
                "rel": "self"
            }, 
            {
                "href": "https://ord.databases.api.rackspacecloud.com/instances/f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080", 
                "rel": "bookmark"
            }
        ], 
        "name": "slave_instance", 
        "status": "ACTIVE", 
        "updated": "2013-05-08T22:43:34", 
        "volume": {
            "size": 20, 
            "used": 0.16368598397821188
        },
        "attributes": {
            "cluster": {
                "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
                "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
                "role": "slave",
                "master": "fbd14327-230c-495e-b6d9-cb593ccd4cbb"
            }
        }
    }
}
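
A consumer of this proposed field might look like the following sketch; the endpoint, token handling, and the attributes block itself are all proposals or assumptions here, not an existing API:

import requests

BASE = "https://ord.databases.api.rackspacecloud.com/v1.0/1234"  # hypothetical tenant
HEADERS = {"X-Auth-Token": "..."}  # auth elided

resp = requests.get(BASE + "/instances/f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080",
                    headers=HEADERS)
resp.raise_for_status()
instance = resp.json()["instance"]

cluster = instance.get("attributes", {}).get("cluster")
if cluster and cluster["role"] == "slave":
    print("slave of master", cluster["master"])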

List cluster types:

GET /clustertypes

Returned JSON:

{
    "clusterTypes": [
        {
            "id": "7782954c-ebec-42f0-894b-d3601016a91e",
            "links": [
                {
                    "href": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
                    "rel": "self"
                },
                {
                    "href": "https://service/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
                    "rel": "self"
                }
            ],
            "name": "Master/Slave Replication",
            "type": "master-slave"
        }
    ]
}

Create Replication Set: (No previous db instance, fresh)
This will create 1 master with 2 replicas.

POST /clusters

{
    "cluster": {
        "nodes": 3,
        "flavorRef": "https://service/v1.0/1234/flavors/1", 
        "name": "replication_set_1", 
        "volume": {
            "size": 2
        },
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e"
        }
    }
}
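
Issuing that create request from a client might look like this sketch; the service URL and auth handling are hypothetical:

import requests

BASE = "https://service/v1.0/1234"  # hypothetical endpoint
payload = {
    "cluster": {
        "nodes": 3,
        "flavorRef": BASE + "/flavors/1",
        "name": "replication_set_1",
        "volume": {"size": 2},
        "clusterConfig": {
            "type": BASE + "/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e"
        }
    }
}
resp = requests.post(BASE + "/clusters", json=payload,
                     headers={"X-Auth-Token": "..."})
resp.raise_for_status()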

Create Replication Set: (Previous db instance)
Create a replication set using primaryNode as the model for instance flavor and volume size; it will serve as the "Master" node if the cluster type is master/slave or equivalent.

POST /clusters

{
    "cluster": {
        "nodes": 3,
        "name": "replication_set_1",
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
            "primaryNode": "https://service/v1.0/1234/instances/d3ca44b6-1ba2-4862-b16d-1b2cd115944d"
        }
    }
}

List all clusters and replication sets:

GET /clusters

Returned JSON object:

{
    "clusters": [
        {
            "clusterConfig": {
                "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
            },
            "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
            "instances": [
                {"id": "fbd14327-230c-495e-b6d9-cb593ccd4cbb"},
                {"id": "7a369ce2-0cfb-4bcd-b342-370a4ae55c09"},
                {"id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080"}
            ]
        }
    ]
}

List cluster or replication set:

{
    "cluster": {
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
        },
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "instances": [
            {"id": "fbd14327-230c-495e-b6d9-cb593ccd4cbb"},
            {"id": "7a369ce2-0cfb-4bcd-b342-370a4ae55c09"},
            {"id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080"}
        ]
    }
}

List cluster or replication set details:

{
    "cluster": {
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
        },
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "instances": [
            {
                "flavor": {
                    "id": "1",
                    "links": [
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/flavors/1",
                            "rel": "self"
                        },
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/flavors/1",
                            "rel": "bookmark"
                        }
                    ]
                },
                "id": "fbd14327-230c-495e-b6d9-cb593ccd4cbb",
                "links": [
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/instances/fbd14327-230c-495e-b6d9-cb593ccd4cbb",
                        "rel": "self"
                    },
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/instances/fbd14327-230c-495e-b6d9-cb593ccd4cbb",
                        "rel": "bookmark"
                    }
                ],
                "name": "master_instance",
                "status": "ACTIVE",
                "volume": {
                    "size": 20
                },
                "attributes": {
                    "cluster": {
                        "role": "master"
                    }
                }
            },
            {
                "flavor": {
                    "id": "1",
                    "links": [
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/flavors/1",
                            "rel": "self"
                        },
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/flavors/1",
                            "rel": "bookmark"
                        }
                    ]
                },
                "id": "7a369ce2-0cfb-4bcd-b342-370a4ae55c09",
                "links": [
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/instances/7a369ce2-0cfb-4bcd-b342-370a4ae55c09",
                        "rel": "self"
                    },
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/instances/7a369ce2-0cfb-4bcd-b342-370a4ae55c09",
                        "rel": "bookmark"
                    }
                ],
                "name": "master_instance",
                "status": "ACTIVE",
                "volume": {
                    "size": 20
                },
                "attributes": {
                    "cluster": {
                        "role": "slave",
                        "master": "fbd14327-230c-495e-b6d9-cb593ccd4cbb"
                    }
                }
            },
            {
                "flavor": {
                    "id": "1",
                    "links": [
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/flavors/1",
                            "rel": "self"
                        },
                        {
                            "href": "https://ord.databases.api.rackspacecloud.com/flavors/1",
                            "rel": "bookmark"
                        }
                    ]
                },
                "id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080",
                "links": [
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/v1.0/1234/instances/f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080",
                        "rel": "self"
                    },
                    {
                        "href": "https://ord.databases.api.rackspacecloud.com/instances/f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080",
                        "rel": "bookmark"
                    }
                ],
                "name": "master_instance",
                "status": "ACTIVE",
                "volume": {
                    "size": 20
                },
                "attributes": {
                    "cluster": {
                        "role": "slave",
                        "master": "fbd14327-230c-495e-b6d9-cb593ccd4cbb"
                    }
                }
            }
        ]
    }
}

Drop node from cluster:

{
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "instances": [
            /* The proposal here would be to allow multiple
            instances to be removed if the clusterType
            supports it. Think "I have a master and 2 slaves
            for Christmas traffic but no longer need the
            slaves." */
            {"id": "7a369ce2-0cfb-4bcd-b342-370a4ae55c09"},
            {"id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080"}
        ]
    }
}

Or, only one at a time? Or both?

{
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "instance": {
            "id": "f7e68ccf-2da6-4bb8-a75b-afe0dd3a0080"
        }
    }
}

Or just reduce the number of nodes and let the system decide?

{
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "nodes": 2
    }
}

Add node to cluster/replication:
Add nodes by just increasing the node count?

{
    /* Adding nodes to the cluster: it is assumed that
    you are bringing a not-yet-created instance into
    the cluster of servers. If nodes are of different
    sizes then default to the size of the master/primary node. */
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "nodes": 5,
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
        }

    }
}
{
    /* Add nodes to the cluster by giving specs on
    the node to add */
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
        },
        "instance": {
            "flavorRef": "https://service/v1.0/1234/flavors/1",
            "volume": {
                "size": 40
            }
        }
    }
}

Promote a slave node to master:

{
    "cluster": {
        "id": "a66cab3a-041b-4c17-82e7-7ca5333a18f3",
        "clusterConfig": {
            "type": "https://service/v1.0/1234/clustertypes/7782954c-ebec-42f0-894b-d3601016a91e",
        },
        ...
    }
}