Tag: ha

  • I got to learn some (more) High Availability things

    In the spirit of continuous improvement, I made another HA setup.

    Same basic front-end with two webservers behind one HAProxy load balancer, but with way more fault tolerance behind them.

    I used the guide I mentioned in the previous HA post to create this new setup.

    Webservers look to 192.168.100.2 for a database connection.  This IP can exist on one of three ProxySQL servers thanks to keepalived.  In my setup, keepalived does two things: it checks that the ProxySQL process is running, and it talks to the other keepalived servers.  On the server that is the keepalived master, keepalived puts 192.168.100.2 on an interface; on the keepalived slaves, 192.168.100.2 is not assigned.  If the ProxySQL process on the master ceases to exist, keepalived removes 192.168.100.2 and one of the slaves puts 192.168.100.2 on one of its interfaces.  This happens in a matter of seconds: 192.168.100.2 goes away on the server that had it almost as soon as I kill ProxySQL, then magically shows up on one of the other ProxySQL servers.

    The technical term for this is a “virtual IP,” but I think “conditional IP” is more appropriate.  A server’s ability to have that IP is conditional on 1) ProxySQL running on it and 2) no other server having that IP.  If both of those conditions are met, then that server can have that IP.
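
    As a reference for future me, here is a minimal keepalived.conf sketch of how this can look on one of the ProxySQL servers (the interface name, priority, and password are placeholders, not my actual values):

      vrrp_script chk_proxysql {
          script "pidof proxysql"   # condition 1: the ProxySQL process exists
          interval 2                # check every two seconds
          fall 2                    # declare failure after two missed checks
      }

      vrrp_instance VI_DB {
          state MASTER              # BACKUP on the other two ProxySQL servers
          interface eth1            # interface that should carry the IP (assumed name)
          virtual_router_id 51
          priority 101              # the slaves get lower priorities (e.g. 100, 99)
          advert_int 1              # VRRP adverts are how the keepaliveds talk to each other
          authentication {
              auth_type PASS
              auth_pass changeme    # placeholder
          }
          virtual_ipaddress {
              192.168.100.2         # condition 2: only the current master holds this
          }
          track_script {
              chk_proxysql          # drop the IP if ProxySQL dies
          }
      }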

    Because I’m using Galera instead of master-slave replication, I can distribute all queries from the webservers across all databases.  In production, depending on the application, this may not be appropriate, but as an exercise for me, this is OK.  From the little research I’ve done, it is possible to confuse a Galera cluster, but only under very high load, with considerable latency between the nodes, and with conflicting queries.  For my purposes, distributing my dinky queries across all three nodes hasn’t produced any issues (i.e. my test WordPress refreshes just fine).
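
    A rough sketch of what the distribution side looks like from ProxySQL’s admin interface: all three Galera nodes go into the same hostgroup, so any of them can receive queries.  The IPs here are placeholders, not my real ones.

      -- run against the ProxySQL admin interface (port 6032 by default)
      INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
          (0, '192.168.100.11', 3306),
          (0, '192.168.100.12', 3306),
          (0, '192.168.100.13', 3306);
      LOAD MYSQL SERVERS TO RUNTIME;
      SAVE MYSQL SERVERS TO DISK;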

    My next idea is to apply the same keepalived treatment to the front end.  That is, have two HAProxy load balancers in front of the webservers, and have keepalived pass the public IP back and forth between them as required.
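
    In principle that should just be the same keepalived recipe with a different health check and the public IP in the virtual_ipaddress block; an untested sketch of the check:

      vrrp_script chk_haproxy {
          script "pidof haproxy"    # hold the public IP only while HAProxy is running
          interval 2
      }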

    Test the WordPress install on the new setup at http://ha2.travnewmatic.com/wp/

  • Galera test cluster created

    Well that was freakishly easy.  Thanks again, Digital Ocean.  It seemed easier to set up than master-slave replication, and it’s way more featureful in that you can write to any node in the cluster, not just the master.

    There may be a way to override this, but it’s like every Galera node is the same server.  With a master-slave setup, you have to be careful about individual databases.  I suppose that could be useful if you wanted the master to put databaseA on slave1 and databaseB on slave2, but the setup from the Digital Ocean tutorial on Galera is way simpler.  Really blown away by how easy this was to set up.
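
    For anyone wondering what “freakishly easy” means in practice: the heart of it is one small config file per node, roughly like the one in the Digital Ocean tutorial.  The cluster name, node names, IPs, and provider path below are placeholders.

      [mysqld]
      binlog_format=ROW
      default-storage-engine=innodb
      innodb_autoinc_lock_mode=2
      bind-address=0.0.0.0

      # Galera / wsrep settings, shared by all nodes
      wsrep_on=ON
      wsrep_provider=/usr/lib/galera/libgalera_smm.so   # path may differ per distro
      wsrep_cluster_name="test_cluster"
      wsrep_cluster_address="gcomm://192.168.100.11,192.168.100.12,192.168.100.13"
      wsrep_sst_method=rsync

      # per-node settings
      wsrep_node_address="192.168.100.11"
      wsrep_node_name="galera1"

    Bootstrap the first node (e.g. galera_new_cluster on MariaDB), start MySQL normally on the other two, and they join and sync on their own.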

  • I got to learn some High Availability things

    As part of a job opportunity, the interviewer assigned some homework.  The expectations were as follows:

    • Redundant webservers
    • Redundant database servers
    • Read-write splitting between the webservers and the database servers

    And ~24 hours later, I have achieved this.  This is spread across six virtual machines:

    • HAProxy server
    • Apache server 1
    • Apache server 2
    • ProxySQL server
    • MySQL master
    • MySQL slave

    Setting up the webservers was simple enough (apt install apache2).

    To setup HAProxy, I followed this guide:  https://www.digitalocean.com/community/tutorials/how-to-use-haproxy-to-set-up-http-load-balancing-on-an-ubuntu-vps

    To set up master-slave replication, I used this guide:  https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-in-mysql
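
    The gist of that guide, as a sketch (the server IDs, paths, and database name are just examples):

      # on the master, in my.cnf
      [mysqld]
      server-id     = 1
      log_bin       = /var/log/mysql/mysql-bin.log
      binlog_do_db  = wordpress

      # on the slave, in my.cnf
      [mysqld]
      server-id     = 2
      relay-log     = /var/log/mysql/mysql-relay-bin.log
      log_bin       = /var/log/mysql/mysql-bin.log
      binlog_do_db  = wordpress

    After that it’s a replication user on the master, a CHANGE MASTER TO ... pointing the slave at the master’s binlog coordinates, and START SLAVE.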

    To set up ProxySQL load balancing and read/write splitting, I followed this guide:  https://proxysql.blogspot.tw/2015/09/proxysql-tutorial-setup-in-mysql.html and to a lesser extent http://www.proxysql.com/blog/configure-read-write-split
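
    The read/write splitting itself boils down to a couple of query rules in ProxySQL’s admin interface, something along these lines (the hostgroup numbers are my assumption: 0 = master/writer, 1 = slave/reader):

      -- SELECT ... FOR UPDATE needs the writer; all other SELECTs go to the reader;
      -- anything that matches no rule falls through to the user's default hostgroup
      INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES
          (1, 1, '^SELECT.*FOR UPDATE$', 0, 1),
          (2, 1, '^SELECT', 1, 1);
      LOAD MYSQL QUERY RULES TO RUNTIME;
      SAVE MYSQL QUERY RULES TO DISK;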

    I’m sure there’s still some tuning and ironing out to do to make things super smooth, but the mechanics work as expected.

    http://ha.travnewmatic.com/ loads a page that shows which webserver you’re using.  If you refresh like crazy and the IP doesn’t change, that’s to be expected.  Session stickiness is implemented, so if you access the page within a certain interval, you’ll continue to talk to the same webserver.

    http://ha.travnewmatic.com/session.php has more information about the cookie you’re getting.  Both this and the previous URL are products of the Digital Ocean guide on setting up HAProxy.  The cookie is set using the “cookie insert method.”
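
    The relevant bit of haproxy.cfg looks roughly like this (server names and IPs are placeholders):

      backend webservers
          mode http
          balance roundrobin
          cookie SERVERID insert indirect nocache   # HAProxy inserts the sticky cookie
          server web1 10.0.0.11:80 check cookie web1
          server web2 10.0.0.12:80 check cookie web2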

    http://ha.travnewmatic.com/wp/ is a vanilla WordPress install sending data through the ProxySQL server to both the master and slave databases.

    This screencap shows the unbalanced send/receive traffic going to and from the master and slave MySQL servers:

    There were a few F5s between issuing those two commands.  Queries to the master went up by 14, while queries to the slave went up by 259.
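
    The screencap aside, a simple way to watch this imbalance is to check the status counters on both database servers before and after a round of refreshes (this may not match the exact commands in the screencap):

      SHOW GLOBAL STATUS LIKE 'Bytes_%';     -- Bytes_received / Bytes_sent
      SHOW GLOBAL STATUS LIKE 'Questions';   -- total statements the server has executed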

    Current limitations: in this topology, ProxySQL represents a single point of failure.  This guide describes a setup involving multiple ProxySQL servers and a Galera database cluster; that was mentioned for my assignment but not required.  That setup makes use of virtual IPs (which at the moment I do not fully understand).  The idea is that you specify a single IP in the web app (i.e. the host the web application looks to for database services), but depending on what’s happening, that virtual IP can refer to different hosts.  The goal is that if one ProxySQL server goes down, the other one takes over and the web app continues to have database access like nothing happened.

    The other limitation involves the master-slave setup and how my ProxySQL server does query routing.  I’m not sure if the routing can be conditional (e.g. if the slave goes down, use the master for everything).  As it stands, if either the master or the slave goes down, nothing works (multiple slaves would partially mitigate this, but if the master goes down, I’m screwed).  Galera seems to be an appropriate solution to this problem, though I haven’t played with Galera yet.  In my setup the slave is meant to be mainly read-only, and the master gets the writes.  From what little I’ve read about Galera, every db server in the cluster is a master that can be used for both reads and writes.

    These are technologies I could use myself, but considering that everything I have in Dallas is on the same physical server, investing a ton of time into HA for my own services would not be a very fruitful endeavor.

    Replacing VMs with containers would drastically reduce space requirements and allow for easier backups in the event poo does hit the fan.  Downtime for me isn’t a huge deal; having to set stuff up again is a bigger deal.  So far so good, though.  My ZFS array is healthy and things have been ridiculously stable (knock on wood), so I’m not so worried about lack of availability.