09 April 2016

Considerations when scaling your database

Different aims of scaling

When we scale a database we aim to improve read and write performance and ensure durability and availability.  The CAP theorem imposes upon us the fact that we cannot have a system that is at once perfectly consistent, perfectly available, and perfectly tolerant of network failure.  The mathematical proof of the theorem has been peer reviewed and so rather than fighting it we should accept it as a constraint when planning a distributed database system.

Scaling Reads

Scaling read performance is fairly straightforward in most database management systems.  We create slave databases that are used exclusively to service queries that read from the database.  A master server will accept writes, and will replicate these these to the slaves.

Until the master has replicated a write the read replicas will hold slightly old data.  Also, while the master is busy replicating the write to the replicas they may have different data from each other.  This means that the ACID requirement for consistency is broken, and rather we rely on eventual consistency.  That is to say that all of our servers will eventually agree on a consistent value for the data.

By using several read replicas to respond to incoming queries the database system is able to spread the load between multiple servers.  This improves availability as a single read replica becoming unavailable does not mean that your system cannot respond to queries.
Read replica system

If all your read replicas are in the same rack then they can all be taken down by a data centre failure.  It is very easy to setup read replicas in multiple data centres if you use a cloud service provider like AWS.

Scaling Writes

Scaling write performance is a great deal more complicated.  When we scale for write performance we typically break up the database into a number of independent shards.  Each shard is managed by its own database engine and doesn't share CPU, memory, disk, or other resources with other shards.  This is often called a "shared nothing" architecture.

There are a great many technical challenges that arise in this problem space and this has given rise to a number of vendor solutions that aim to reduce the complexity.  Oracle offers the commercial MySQL cluster package.  The free Galera cluster package works with MariaDB, MySQL, and PostgreSQL.  MongoDB supports sharding out of the box and has commercial tooling available to make it simpler to manage.

In a sharded database we break up the data into subsets and give each shard authority for looking after a set of our data.  Careful thought must be given as to how to shard our data.  If a query requires information from several shards then our database manager must collect the information it needs from multiple shards, collate this, and return the result.  This is obviously a lot more expensive than querying a single shard.