29 June 2017

Context and caching

In this post I'm going to take a look at how we can improve performance by avoiding having our code run at all.

Let's assume that we have users from different parts of the world accessing our application.  We've decided to deploy behind a CDN so that people who are a long way from our server don't have to wait on the network.  The CDN caches responses at edge locations close to users and only passes a request to our load balancer if it cannot serve that request out of its cache.

Having the CDN respond to the user's request is the best-case scenario because the network distance is reduced and our infrastructure never sees the request at all.  A user in Hong Kong will get their response just as quickly as a user in London, where we're hosting our server.

If we were not using a CDN we would probably want to use a caching load balancer like Varnish or Nginx, but in our case let's assume that the load balancer just passes the request on to a server.

The request reaches a web server, which will also decide whether or not it can respond with a cached result.  If our web server is able to respond without hitting PHP we're still doing okay for performance, because web servers are pretty quick at returning cached content.  If our web server is struggling to return cached content fast enough we can just move it onto a bigger server or add more servers to our cluster.  In other words, if most of the requests reaching our web server never reach PHP, scaling becomes a lot simpler.

If the web server cannot serve the request out of cache then it will need to pass it on to a PHP-FPM worker.  The PHP worker will read the bytecode it needs to run the program out of OPcache, or it will read the file(s) off disk and compile them.  It will run the program and see if it can read parts of what it needs out of our Redis/Memcached cache store.  If it can't then it will make queries against the database.

Our caches are intended to avoid the expense of I/O operations (network and disk) and processing time (database and PHP processing).  The closer we can cache to the user the more we can save.
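
The chain of caches described above can be sketched as a list of layers that are checked from the outside in.  This is only a toy model in Python: the `CacheLayer` class, the `fetch` function and `slow_origin` are names made up for illustration, and real CDNs, web servers and Redis obviously don't share one process.

```python
from typing import Callable, Dict, List, Optional


class CacheLayer:
    """A toy in-memory cache standing in for one tier (CDN, web server, Redis)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def set(self, key: str, value: str) -> None:
        self.store[key] = value


def fetch(key: str, layers: List[CacheLayer], origin: Callable[[str], str]) -> str:
    """Return the value for key, checking the layer closest to the user first."""
    for depth, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            # Fill the layers closer to the user so the next request stops sooner.
            for outer in layers[:depth]:
                outer.set(key, value)
            return value
    # Every cache missed: run the expensive code, then populate every layer.
    value = origin(key)
    for layer in layers:
        layer.set(key, value)
    return value


origin_calls: List[str] = []


def slow_origin(key: str) -> str:
    """Hypothetical stand-in for PHP plus the database."""
    origin_calls.append(key)
    return f"response for {key}"


layers = [CacheLayer("cdn"), CacheLayer("web-server"), CacheLayer("redis")]
first = fetch("/events?week=1", layers, slow_origin)   # misses everywhere, runs the origin
second = fetch("/events?week=1", layers, slow_origin)  # answered by the outermost layer
```

The second call never reaches `slow_origin`, which is the whole point: the further out a request is answered, the less of our stack it touches.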


The problem with context


The context of a program is the minimal set of data that you need to store in order to be able to resume processing in case it is interrupted.  It's what you need to know to be able to build up the state of the application.

If I know the context of an application then I should be able to interrupt it and later rebuild its state and allow it to continue running.  If I do so I should expect the same results that I would have had if I just left it to run.

So why should context be a problem?  Surely the ability to predict what code does allows us to do important things like writing test assertions and debugging?

In the PHP world a program's initial context comes from input and your session.   Input can be supplied through your request variables, and through cookies. We run through our code and come out with an entirely predictable and (most importantly) cacheable result.

A program that is run twice with the exact same initial context should produce the same result each time.  This means that you can confidently store the result of running the program knowing full well that the next time the program is run with that particular context you can just return the result
that you already know.

This leads to the observation that you can only cache the output of your code within the context in which it was run.
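
As a rough illustration of caching within a context, here is a small Python sketch that stores results keyed by the whole context.  The `cached_by_context` decorator and the `list_events` function are hypothetical; the only assumption is that the context can be expressed as a dictionary of strings.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]
Key = Tuple[Tuple[str, str], ...]


def context_key(context: Context) -> Key:
    """Turn a context into a hashable key that ignores the order of its entries."""
    return tuple(sorted(context.items()))


def cached_by_context(func: Callable[[Context], str]) -> Callable[[Context], str]:
    """Only run func when this exact context has not been seen before."""
    results: Dict[Key, str] = {}

    def wrapper(context: Context) -> str:
        key = context_key(context)
        if key not in results:
            results[key] = func(context)
        return results[key]

    return wrapper


runs: List[Context] = []  # records every time the "program" really runs


@cached_by_context
def list_events(context: Context) -> str:
    runs.append(dict(context))
    return f"{context['category']} events, sorted {context['sortOrder']}"


a = list_events({"category": "music", "sortOrder": "desc"})
b = list_events({"sortOrder": "desc", "category": "music"})  # same context: cached
c = list_events({"category": "music", "sortOrder": "asc"})   # new context: runs again
```

Three calls are made but the function body only runs twice, because the first two calls share a context.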

Practical example


Let's imagine an API that a client calls to list upcoming calendar events.  The client is displaying an events calendar to a user.

Because we're such nice people we've added lots of functionality to our API to make life easier for the client.  We've allowed the client to specify a starting date and an ending date so that they can display a week or a month at a time.  We're also allowing the client to specify an event category so that it can present a drop-down box for the user.  Lastly, we allow the client to specify the sort order in which they would like to receive their data.

A call to our API might look something like this:

https://api.myapp.com/events?start=20170101&end=20170108&category=music&sortby=start&sortOrder=desc

When the call starts it has five variables in its context.  If we cache this call in Varnish/Nginx/CDN then we'll be caching the result of calling our API with an extremely specific context.

If another person wanted to see events for the same date range, same sort order, but for a different category we would need to run the PHP application and cache that response. Even a small change in the initial context means that we can't use our cache at all.

If a third user wanted to see exactly the same information as either of the other users but wanted to see it in ascending order we would need to generate the results for them specifically.

You can see that as you add users you're going to have an increasing number of possible combinations.  The fact that we're allowing arbitrary dates to be selected makes this especially problematic.

With the possibility of so many different contexts we can be pretty sure that we're going to be missing our outer HTTP caches a whole lot.
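
To get a feel for the numbers, we can multiply out the possible values of each parameter.  The figures below are invented purely for the arithmetic (a year of start dates, ranges of 1 to 31 days, ten categories, two sort fields, two sort orders); your own API will have different ones.

```python
from math import prod
from typing import Dict

# Hypothetical figures, chosen only to illustrate the arithmetic.
flexible_api: Dict[str, int] = {
    "start dates": 365,
    "range lengths (1 to 31 days)": 31,
    "categories": 10,
    "sort fields": 2,
    "sort orders": 2,
}


def distinct_contexts(parameters: Dict[str, int]) -> int:
    """Every combination of parameter values needs its own cache entry."""
    return prod(parameters.values())


# The same API if sorting were no longer a request parameter.
without_sorting = {
    name: count for name, count in flexible_api.items()
    if not name.startswith("sort")
}

print(distinct_contexts(flexible_api))     # 452600
print(distinct_contexts(without_sorting))  # 113150
```

Each parameter multiplies the number of cache entries, so removing even a small one has a large effect on the hit rate.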


Load balancing between your server and the user's device


We can write very powerful API endpoints that allow a lot of flexibility.  In our example the client would be able to get the information it needs in the format that it needs because we're doing all of the work in PHP on the server.

What if we were to delegate some of the processing responsibility over to the user device that is consuming the API?

If we could get our client to receive an unformatted list of data that they then sort and slice into the format that they need then we wouldn't need to worry about doing that on the server.

More importantly, and this is the crux, we could reduce the variability of the context in which transactions occur.  This makes it much easier to cache our results in our edge caches and avoid having our PHP run at all.
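
As a sketch of what the client would take over, the function below filters and sorts a generic list of events locally.  It is written in Python to keep the examples in one language (a real browser client would do this in JavaScript), and the `present` function and the event fields are assumptions made for illustration.

```python
from typing import Dict, List, Optional

Event = Dict[str, str]


def present(events: List[Event], category: Optional[str] = None,
            sort_by: str = "start", descending: bool = False) -> List[Event]:
    """Filter and sort on the client instead of asking the server to do it."""
    selected = [e for e in events if category is None or e["category"] == category]
    return sorted(selected, key=lambda e: e[sort_by], reverse=descending)


# One generic, cacheable response from the server...
events = [
    {"name": "Jazz night", "category": "music", "start": "20170103"},
    {"name": "Book club", "category": "books", "start": "20170102"},
    {"name": "Choir", "category": "music", "start": "20170105"},
]

# ...can back any number of views without another request.
music_newest_first = present(events, category="music", descending=True)
everything_by_date = present(events)
```

Both views come from the same server response, so the server only ever has to produce (and the edge only has to cache) one result.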

By asking the client to take on the load of transforming the data we're allowing ourselves to respond in a generic way to the request.  The client is now responsible for managing the context of the application, which frees up the server to work on managing data.

This is why frameworks like ReactJS and Angular are so powerful for the web: they make it possible to manage context for complex applications in the client.

So instead of offering all of the parameters that it previously did, we could refactor our API to only allow the client to specify the number of the week for which it wants results.  The backend will calculate the date that the week starts on and return events for that week.
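
The backend calculation could look something like the sketch below.  It assumes ISO 8601 week numbering (weeks start on Monday) and that the year is supplied alongside the week number; neither detail is fixed by the design above, and `week_range` is a hypothetical helper name.

```python
from datetime import date, timedelta
from typing import Tuple


def week_range(year: int, week: int) -> Tuple[date, date]:
    """Return the first and last day of an ISO week (Monday to Sunday)."""
    monday = date.fromisocalendar(year, week, 1)  # requires Python 3.8+
    return monday, monday + timedelta(days=6)


start, end = week_range(2017, 1)
print(start, end)  # 2017-01-02 2017-01-08
```

Note that under ISO numbering the first week of 2017 starts on 2 January, not 1 January, so whichever convention you choose needs to be documented for your clients.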

Obviously you would need to look at your use case when trying to generalize your API.  If your clients typically need a month at a time then sending them a month's worth of data at a time might be better than asking them to make four or five requests of a week each.  If the amount of data being transferred is too large you might look at splitting your API into one endpoint that gives a list of events and another endpoint that gives details of events.


Cookies and caching


Both Nginx and Varnish will, in the default configuration, not cache an object coming from the backend with a Set-Cookie header present.  It is possible to override the setting, but it's preferable to avoid using cookies at all.

Why avoid cookies?  The short answer is that cookies are associated with context because they're being used either to provide input parameters or to indicate session variables.

Cookies are the traditional way to store a user's session identifier.  We then use the session information to identify the user who authenticated themselves previously.  Often frameworks that use cookie-based sessions (like Laravel) will also set a cookie for a user who has not been authenticated.  This allows us to maintain session state for the user on the server.

The purpose of a cookie is to associate a request with a particular session state.  In other words the cookie is an indication that you're using session stores.  If you're confident that your responses can be cached because there is no context coming from the cookie or session then you can configure your caches to ignore them.

The outer caches will see that we're attaching a cookie to our response and will assume that we intend to use that cookie in the future.  If you are confident that the cookie is not part of the program context you can instruct the cache to disregard the cookie and store the response regardless.
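
The decision the cache is making can be modelled in a few lines.  This is a deliberately simplified Python model of the default behaviour described above, not the actual logic or configuration of Nginx or Varnish; `should_store` and its `ignore_set_cookie` flag are hypothetical names.

```python
from typing import Mapping


def should_store(response_headers: Mapping[str, str],
                 ignore_set_cookie: bool = False) -> bool:
    """Decide whether a toy cache would keep this backend response."""
    # Header names are case-insensitive, so compare them in lower case.
    has_cookie = any(name.lower() == "set-cookie" for name in response_headers)
    # By default a response that sets a cookie is assumed to be user-specific.
    return ignore_set_cookie or not has_cookie


plain = should_store({"Content-Type": "application/json"})
with_cookie = should_store({"Set-Cookie": "session=abc123"})
overridden = should_store({"Set-Cookie": "session=abc123"}, ignore_set_cookie=True)
```

Here `plain` and `overridden` are stored while `with_cookie` is not.  Turning on the override is only safe when you are sure the cookie plays no part in producing the response.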

Can a RESTful API use cookies?

There is a lot of argument about whether cookies are "RESTful" and I think it
really boils down to what you're storing in the session that the cookie
identifies.  Using JWT also requires you to use an HTTP header and a cookie
is also just an HTTP header.  I can't see any fundamental difference when
it comes to using either method to identify a user, but if you're storing
state other than user authentication in the session then that could be seen
as problematic for being RESTful.

Another reason to avoid cookies is that browsers are just one sort of client that you can expect to be using your API.  You might find that cookie-based user identification is very inconvenient for somebody writing a Java application to run on Android.

If you are going to use cookies then you'll need to look at making sure that your edge caches know when to disregard them and cache a response regardless of it having a cookie attached.  To me this feels like more work than avoiding them entirely.

If you're using cookies to manage sessions then consider swapping to a different way to authenticate your user.  JWTs are very convenient and will let you avoid needing to worry about reconfiguring your caches.

So how can we avoid context?

Avoid session variables


You do not want to persist session state between calls.  The session state forms part of your context and will affect the results of running your program.  This complicates your caching, because you can only cache within the entire context of your transaction.  Your cached results will only be meaningful to the specific case where the user has that exact session state.

The context of your API needs to be limited to the input parameters that it gets from the HTTP request.  Let the client maintain the state of the application and let your API server worry about supplying the data that the client requests.


Dumb down your API


Don't have your API accept every parameter under the sun and serve up responses that are specific to an application context.  Let your API provide information that can be reused in different application states and cached on the edges of your network.

You might need to rewrite portions of your API and update your clients that consume them.  The performance gains you obtain from this will be significant and will dwarf any efforts you make to optimize your code.  Not running your code at all will always be faster than running optimized code.

The effect of dumbing down your API is to limit the variability of the request parameters passed to it.  By limiting the variability you increase the chance that requests will match a previously cached version.
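
One way to picture this is as a cache key built only from the parameters the API actually supports.  The sketch below assumes a single supported parameter called `week`, as in the refactored calendar example, and `cache_key` is a hypothetical helper; real caches are configured rather than coded like this.

```python
from typing import Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit

ALLOWED_PARAMS = ("week",)  # assumption: the only parameter the API accepts


def cache_key(url: str, allowed: Iterable[str] = ALLOWED_PARAMS) -> str:
    """Build a cache key from the path and the supported parameters only."""
    parts = urlsplit(url)
    allowed_set = set(allowed)
    params = [(k, v) for k, v in parse_qsl(parts.query) if k in allowed_set]
    query = urlencode(sorted(params))  # sort so that parameter order is irrelevant
    return parts.path + ("?" + query if query else "")


key_a = cache_key("https://api.myapp.com/events?week=3&utm_source=mail")
key_b = cache_key("https://api.myapp.com/events?utm_source=web&week=3")
print(key_a)           # /events?week=3
print(key_a == key_b)  # True
```

Two requests that differ only in parameters the API ignores end up sharing one cache entry.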

REST


A core aim of a RESTful API is to keep operations stateless: each request carries everything the server needs to answer it.

Scalability is one of the big wins of having a RESTful API.  Sure you'll enjoy other benefits, but for the purposes of this chapter if you make your API RESTful then you'll avoid context and allow your responses to be cached.

There's a lot of focus on REST with regard to verbs and structuring your endpoints.  All of that is good, but when it comes to performance don't lose sight of the goal of stateless operations.