Author: Travis Newman

  • Galera test cluster created

    Well that was freakishly easy.  Thanks again Digital Ocean.  It seemed easier to set up than master-slave replication, and it’s way more featureful in that you can write to any node in the cluster, not just the master.

    There may be a way to override this, but it’s like every Galera node is the same server.  With a master-slave setup, you have to be careful about individual databases.  I suppose that could be useful if you wanted a master to put databaseA on slave1 and databaseB on slave2, but going by the Digital Ocean tutorial on Galera, this is way simpler.  Really blown away by how easy this was to set up.
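    For reference, the heart of a Galera node is just a handful of wsrep settings.  A minimal sketch, assuming MariaDB 10.1+ on Ubuntu like the Digital Ocean tutorial uses (the IPs and cluster name here are made up):

      # /etc/mysql/conf.d/galera.cnf -- identical on every node except wsrep_node_address
      [mysqld]
      binlog_format=ROW
      default-storage-engine=InnoDB
      innodb_autoinc_lock_mode=2
      bind-address=0.0.0.0
      wsrep_on=ON
      wsrep_provider=/usr/lib/galera/libgalera_smm.so
      wsrep_cluster_name="test_cluster"
      wsrep_cluster_address="gcomm://10.0.0.11,10.0.0.12,10.0.0.13"
      wsrep_node_address="10.0.0.11"

      # bootstrap the FIRST node, then plain-start the rest
      galera_new_cluster
      systemctl start mysql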

  • I got to learn some High Availability things

    As part of a job opportunity, the interviewer assigned some homework.  The expectations were as follows:

    • Redundant webservers
    • Redundant database servers
    • Read-write splitting between the webservers and the database servers

    And ~24 hours later, I had achieved it.  The setup is spread across six virtual machines:

    • HAProxy server
    • Apache server 1
    • Apache server 2
    • ProxySQL server
    • MySQL master
    • MySQL slave

    Setting up the webservers was simple enough (apt install apache2).
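    On each web VM that really was about it, plus PHP so session.php and WordPress would have something to run on.  Assuming Ubuntu package names:

      apt install apache2 php libapache2-mod-php php-mysql
      # quick, hypothetical way to tell the two webservers apart while testing:
      hostname > /var/www/html/index.html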

    To set up HAProxy, I followed this guide:  https://www.digitalocean.com/community/tutorials/how-to-use-haproxy-to-set-up-http-load-balancing-on-an-ubuntu-vps
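    Stripped of the boilerplate, the load-balancing part of /etc/haproxy/haproxy.cfg comes down to something like this (backend IPs hypothetical):

      # /etc/haproxy/haproxy.cfg (excerpt)
      frontend www
          bind *:80
          mode http
          default_backend webservers

      backend webservers
          mode http
          balance roundrobin
          server web1 10.0.0.21:80 check
          server web2 10.0.0.22:80 check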

    To set up master-slave replication, I used this guide:  https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-in-mysql
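    Condensed, the replication setup amounts to: give each server a unique server-id, turn on the binlog on the master, then point the slave at the master’s current binlog coordinates.  Roughly, with made-up credentials and coordinates:

      # master, in /etc/mysql/my.cnf:  server-id = 1  and  log_bin = /var/log/mysql/mysql-bin.log
      mysql -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%' IDENTIFIED BY 'password';"
      mysql -e "SHOW MASTER STATUS;"    # note File and Position

      # slave, in /etc/mysql/my.cnf:  server-id = 2
      mysql -e "CHANGE MASTER TO MASTER_HOST='10.0.0.31', MASTER_USER='repl', \
                MASTER_PASSWORD='password', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=107;"
      mysql -e "START SLAVE; SHOW SLAVE STATUS\G"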

    To set up ProxySQL load balancing and read/write splitting, I followed this guide:  https://proxysql.blogspot.tw/2015/09/proxysql-tutorial-setup-in-mysql.html and to a lesser extent http://www.proxysql.com/blog/configure-read-write-split
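    The gist of the ProxySQL side: backend servers go into hostgroups (0 for writes, 1 for reads here), regex query rules steer SELECTs to the reader, and everything is configured through the admin interface on port 6032.  A sketch with made-up IPs and credentials:

      mysql -u admin -padmin -h 127.0.0.1 -P 6032 <<'EOF'
      INSERT INTO mysql_servers (hostgroup_id, hostname, port)
        VALUES (0, '10.0.0.31', 3306),  -- master
               (1, '10.0.0.32', 3306);  -- slave
      INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
        VALUES (1, 1, '^SELECT.*FOR UPDATE$', 0, 1),  -- locking reads stay on the master
               (2, 1, '^SELECT', 1, 1);               -- all other SELECTs go to the slave
      INSERT INTO mysql_users (username, password, default_hostgroup)
        VALUES ('wordpress', 'password', 0);
      LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
      LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
      LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
      EOF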

    I’m sure there’s still some tuning and ironing out to do to make things super smooth, but the mechanics work as expected.

    http://ha.travnewmatic.com/ loads a page that shows which webserver you’re using.  If you refresh like crazy and the IP doesn’t change, that’s to be expected.  Session stickiness is implemented, so if you access the page within a certain interval, you’ll continue to talk to the same webserver.

    http://ha.travnewmatic.com/session.php has more information about the cookie you’re getting.  Both this and the previous URL are products of the Digital Ocean guide on setting up HAProxy.  It’s set using the “cookie insert method.”
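    The stickiness itself is only a couple of lines in the HAProxy backend: HAProxy inserts a cookie naming whichever server it picked, and routes you back there for as long as the cookie is presented.  In config terms it looks something like:

      backend webservers
          mode http
          balance roundrobin
          cookie SRVNAME insert
          server web1 10.0.0.21:80 cookie S1 check
          server web2 10.0.0.22:80 cookie S2 check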

    http://ha.travnewmatic.com/wp/ is a vanilla WordPress install sending data through the ProxySQL server to both the master and slave databases.
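    The only WordPress-specific change is pointing it at ProxySQL rather than at a database server directly; ProxySQL answers application traffic on port 6033 by default (6032 is admin only).  With a hypothetical ProxySQL IP:

      # in wp-config.php, DB_HOST names the ProxySQL box, not MySQL itself
      sed -i "s/define( *'DB_HOST'.*/define('DB_HOST', '10.0.0.41:6033');/" /var/www/html/wp/wp-config.php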

    This screencap shows the unbalanced send/receive traffic going to and from the master and slave MySQL servers:

    There were a few F5s between issuing those two commands.  Queries to the master went up by 14, while queries to the slave went up by 259.
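    The counts presumably came from a per-server status counter; this is my guess at the kind of command behind the screencap:

      # run on the master and on the slave, before and after the refreshes
      mysql -e "SHOW GLOBAL STATUS LIKE 'Questions';"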

    Current limitations:  In this topology, ProxySQL represents a single point of failure.  This guide describes a setup involving multiple ProxySQL servers and a Galera database cluster.  That was mentioned but not required for my assignment; the setup makes use of virtual IPs (which at the moment I do not fully understand).  The idea is that you can specify a single IP in the web app (i.e. the host that the web application looks to for database services), but depending on what’s happening, that virtual IP could refer to different hosts.  The goal is that, in the event of one ProxySQL server going down, the other one takes over and the web app continues to have database access like nothing happened.
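    From what I can tell, the usual tool for that floating-IP trick is keepalived: both ProxySQL boxes run it, and whichever one wins the VRRP election holds the virtual IP that the web app points at.  A sketch of what I believe the config looks like (every value hypothetical):

      # /etc/keepalived/keepalived.conf on the primary ProxySQL box
      vrrp_instance VI_1 {
          state MASTER           # the standby box uses BACKUP and a lower priority
          interface eth0
          virtual_router_id 51
          priority 101
          virtual_ipaddress {
              10.0.0.40          # the single IP the web app is configured with
          }
      }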

    The other limitation involves the master/slave setup and how my ProxySQL server does query routing.  I’m not sure if this can be conditional (e.g. if the slave goes down, use the master for everything).  If either the master or the slave goes down, nothing works (this could be partially mitigated by having multiple slaves, but if the master goes down, I’m screwed).  Galera seems to be an appropriate solution to this problem, though I haven’t played with it yet.  In my setup the slave is meant to be mostly read-only and the master gets the writes.  From what little I’ve read about Galera, every db server in the cluster is a master that could be used for reads and writes.
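    That said, from what I’ve read, ProxySQL may be able to handle the conditional part itself: registering the writer/reader hostgroup pair in mysql_replication_hostgroups makes it watch each backend’s read_only flag and move servers between hostgroups when roles change, with the writer also serving reads if the reader hostgroup empties out, if I’m understanding the docs right.  Something like:

      mysql -u admin -padmin -h 127.0.0.1 -P 6032 <<'EOF'
      INSERT INTO mysql_replication_hostgroups (writer_hostgroup, reader_hostgroup, comment)
        VALUES (0, 1, 'failover between master and slave');
      LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
      EOF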

    These are technologies I could use myself, but considering that everything I have in Dallas is on the same physical server, investing a ton of time into HA for my own services would not be a very fruitful endeavor.

    Replacing VMs with containers would drastically reduce space requirements and allow for easier backup in the event poo does hit the fan.  Downtime for me isn’t a huge deal; having to set stuff up again is a bigger deal.  So far so good, though.  My ZFS array is healthy and things have been ridiculously stable (knock on wood), so I’m not so worried about lack of availability.