05 June 2017

Using and tuning Opcache

PHP is an interpreted language.  This means that there is an engine which reads
your PHP and converts it to an internal format called opcode.  The engine is
then able to execute the opcode directly.

This conversion happens every time the script is run.  Compiling is very fast,
but it still requires some computation and possibly a read of the file off disk.
PHP uses a single-request-per-script-execution model and so in normal operation
every time your program runs it would need to be converted to opcode.

An opcode cache stores the prepared opcode for a PHP file, which saves the PHP
engine from having to compile the file every time it runs.

There are several different opcode caches.  I'm focusing on Opcache because
it's baked into default installations of current versions of the most popular
implementation of PHP, so the user does not need to install anything.

Tuning Opcache


Now that we have some context about Opcache and what it does, let's look at
how it can be tuned.  We want to understand the trade-offs and constraints
involved in changing the settings.

You can configure the amount of memory it has available and also the number of
files that it is able to store.  There's also a setting that controls the
maximum amount of wasted memory that Opcache will tolerate.  We'll be going
into all of these in more detail.

The default values for Opcache are actually a little bit low if you're using a
framework or a lot of vendor libraries.  You should also remember that some
templating systems generate code for your views so there might be PHP files in
your application that aren't in your source control.

The aim of tuning Opcache is to avoid having it become full and to avoid having
it restart itself.  We'll look into the details of this shortly but when we're
tuning performance we'll be working within the constraints set out by these
situations.

How does Opcache decide that it is "full"?


Opcache considers itself "full" when it is unable to store more files.  It
reaches this limit either when it runs out of memory or runs out of storage
slots, or keys.

You limit the memory available to Opcache with the `memory_consumption` setting
and the number of keys or storage slots with the `max_accelerated_files` setting.
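
As a rough illustration, the two limits might appear in an ini file like this.  The values are placeholders chosen for the example, not recommendations, and the right numbers depend on how many PHP files your application and its vendor libraries contain.

```ini
; Illustrative values only (assumptions for this example).
; Total shared memory available to Opcache, in megabytes.
opcache.memory_consumption=128
; Upper limit on the number of keys (storage slots) Opcache can hold.
opcache.max_accelerated_files=10000
```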

Recycling the cache


Opcache isn't alone in this problem and other cache systems can also become
full.  Often a cache will use a system to determine which keys to discard when
it gets full.  For example, it might decide that the "least recently used" piece
of data it is storing should be discarded so that it can store new information.

There are lots of these strategies but Opcache doesn't use any of them and it
never replaces anything in the cache with new information.  The only way in
which Opcache ever replaces information in its cache is to empty it out and
start from scratch.  It throws away all of its cached information and forces
PHP to recompile the files so that it can cache the results again.

So what happens if a file on disk changes?  Will Opcache remove the old entry
from the cache and store the new version instead?

No.  It won't.  It will cache the new version and mark the memory being used
to store the old version as "wasted".  The old version will still be using a slot
in the maximum number of keys Opcache can store and will still be using up
memory allocated to Opcache.

Once Opcache is full it will check the amount of wasted memory and compare it
against the configured `max_wasted_percentage` setting.

If Opcache is wasting more memory than this percentage then it will restart and
rebuild the cache from scratch.  If it is wasting less then it will not restart,
and files that are not in the cache will never be cached and will always be
compiled when they are required.
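
The decision described above can be sketched as a small model.  This is a simplified Python illustration of the logic in this section, not Opcache's actual implementation; the class, the function names and the returned labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheState:
    """Hypothetical snapshot of the cache, used only for this sketch."""
    memory_total: int   # bytes allowed by memory_consumption
    memory_used: int    # bytes holding current entries
    memory_wasted: int  # bytes still held by stale versions of files
    keys_used: int      # storage slots in use (stale entries included)
    keys_max: int       # limit set by max_accelerated_files


def is_full(state: CacheState, needed_bytes: int) -> bool:
    """Full means out of memory or out of keys."""
    free = state.memory_total - state.memory_used - state.memory_wasted
    return free < needed_bytes or state.keys_used >= state.keys_max


def wasted_percentage(state: CacheState) -> float:
    return 100.0 * state.memory_wasted / state.memory_total


def on_new_file(state: CacheState, needed_bytes: int,
                max_wasted_percentage: float) -> str:
    """Decide what happens when a file that is not cached gets compiled."""
    if not is_full(state, needed_bytes):
        return "cache"             # there is room, so store the opcode
    if wasted_percentage(state) >= max_wasted_percentage:
        return "restart"           # enough waste to justify emptying the cache
    return "compile_uncached"      # full, no restart: compile on every request
```

The last branch is the situation worth avoiding when tuning: the cache is full, too little of it is wasted to trigger a restart, and every new file is compiled on each request.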

Detecting when files change


By default Opcache is configured to make sure that the file has not changed
on disk before it returns the cached opcode.  This is great if you expect your
files to be changing, but is probably not the best option for production.

The `validate_timestamps` setting can be set to zero to prevent Opcache from
ever checking disk to see if the file has changed.  This saves on disk I/O and
any saving on any sort of I/O is always a good thing.

Of course this leads to the problem of needing to clear out your cache when you
deploy new code.  The most common solution I've seen to this is to restart the
PHP-FPM worker.  This results in Opcache being emptied out and any current
requests being terminated.  This might be fine if you don't work in a shop
that does continuous deployment and are able to schedule deployments for times
when your site is not so busy.

How could we avoid the situation where we dump all of the people using our site
and start our cache cold?

One way to work around this is to point your web-server document root to a new
location and reload its configuration.  The web-server and PHP-FPM
processes are different and so you're able to change the web-server configuration
without affecting PHP-FPM.

By doing this your web-server will start up new workers for new requests and
not terminate the workers that are busy finishing up the requests that are
currently running.

This still doesn't solve the problem of needing to clear out Opcache so that
your freshly deployed code is used instead of the code in the cache.  Luckily
we are not forced to choose between always checking for a new version or never
checking.  We're able to tell Opcache to check for a new version on disk at
a configurable interval, using the `revalidate_freq` setting.  When we deploy
code then once that time has passed Opcache will go ahead and cache the new
version.
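
A minimal sketch of that middle ground in an ini file follows.  The interval is set with `opcache.revalidate_freq`, measured in seconds, and it only has an effect while timestamp validation is switched on.  The value of 60 is an assumption for the example, not a recommendation.

```ini
; Keep timestamp checks enabled...
opcache.validate_timestamps=1
; ...but check a given file for changes at most once per interval (seconds).
; 60 is an illustrative value.
opcache.revalidate_freq=60
```

With settings like these, freshly deployed code would be picked up within about a minute of landing on disk, at the cost of the wasted memory described next.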

Of course this doesn't mean that Opcache won't need to restart.  The amount of
wasted memory will increase when the new files are cached and you'll be
filling up your memory and keys quicker than you would otherwise.

I think that's why most of the time people just restart Opcache when they
deploy new code.  It's the simplest approach and if you do it during
scheduled downtime your users won't complain as much.

Finding your Opcache settings


By default the Opcache settings are contained in your php.ini file.  This might
be different in your distribution.  Remember that you can go to your shell
and use the command `php --ini` to get a list of all of the ini files that PHP
is loading.  You can pass that list into grep to find all mentions of `opcache`
to narrow the list down.
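
If you would rather script that search than pipe it through grep, a rough Python equivalent could look like the sketch below.  It assumes Unix-style absolute paths in the output of `php --ini`, and both function names are made up for this example.

```python
import re


def loaded_ini_files(php_ini_output: str) -> list:
    """Pull absolute .ini paths out of the text printed by `php --ini`."""
    # A path starts with "/" and ends in ".ini", followed by whitespace,
    # a comma, or the end of the text.
    return re.findall(r"/[^\s,]*\.ini(?=[\s,]|$)", php_ini_output)


def opcache_lines(paths) -> list:
    """Return (path, line number, line) for every line mentioning opcache."""
    hits = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if "opcache" in line.lower():
                    hits.append((path, number, line.rstrip("\n")))
    return hits
```

The text to feed into `loaded_ini_files` can be captured with `subprocess.run(["php", "--ini"], capture_output=True, text=True).stdout`.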

Profiling Opcache

The only way to get a look into Opcache's running state is to use PHP functions
like `opcache_get_status()`.  This function returns an array that contains a
fair amount of detail about the state of Opcache.
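
To give a feel for what that array holds, here is a hedged sketch that condenses it into the numbers that matter for tuning.  It assumes the status has been exported as JSON, for example by a small PHP page (hypothetical, not shown) that prints `json_encode(opcache_get_status(false))`.  The field names are the ones I'd expect in that array; check them against your PHP version.

```python
import json


def summarise_status(status_json: str) -> dict:
    """Condense an exported opcache_get_status() array into tuning figures."""
    status = json.loads(status_json)
    memory = status["memory_usage"]
    stats = status["opcache_statistics"]
    total = (memory["used_memory"] + memory["free_memory"]
             + memory["wasted_memory"])
    return {
        "cache_full": status["cache_full"],
        "memory_free_pct": 100.0 * memory["free_memory"] / total,
        "memory_wasted_pct": 100.0 * memory["wasted_memory"] / total,
        "keys_used_pct": 100.0 * stats["num_cached_keys"]
                         / stats["max_cached_keys"],
        # Restarts caused by running out of memory, running out of keys,
        # or being triggered by hand.
        "restarts": stats["oom_restarts"] + stats["hash_restarts"]
                    + stats["manual_restarts"],
    }
```

A wasted percentage creeping towards `max_wasted_percentage`, or a keys figure near 100, is a sign that the cache is about to become full.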

It is slightly inconvenient that you need to run PHP in order to view the
state of your Opcache.  It means that you need to have your webserver serve up
a page that makes these internal calls for you.  Naturally you'll configure
Nginx to allow access to that page only for requests from your IP but it's
still a bit annoying that you have to do this.

Opcache is shared between all of your worker pools.  I'm assuming that because
you're reading a book about scaling you're not sharing your server with other
users.  In the event that there are other websites on the server you should
double check that your host has configured things properly.  By default any
PHP script is able to access the entire cache and so is able to read your
source.

For us the most important aspect of Opcache being shared between all worker
pools is that when we allocate memory we're sharing that between everything.
In other words if you allocate 128MB of RAM to Opcache then that is the total
to be shared and not the amount per worker or pool.

Right, so we know that Opcache is shared between all the workers in all the
pools.  This means that we don't have to use our actual application to profile
Opcache.  As long as a PHP script is run in PHP-FPM it will be able to tell us
how Opcache is working for all of the other workers.

The easiest way to do this is to upload a script written by somebody else to
your server.  There are several on Github but my favourite one was written by
Rasmus Lerdorf.  It's my favourite because it's a single file application which
I find easy to drop onto a server without affecting much else.  

All of these solutions could potentially leak sensitive information about your
PHP installation.  They require you to add security to make sure that the
endpoint is not accessible to the public.  This is just another thing to go
wrong.  

An alternative is to use Zabbix to monitor Opcache instead.  Zabbix doesn't have
access to PHP so it cannot monitor Opcache directly, but it is extremely
flexible in allowing custom data imports.  You can very easily set up a PHP
script that outputs the Opcache data into a format that Zabbix can ingest.
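
As one possible shape for that import, the sketch below flattens a decoded status array into `host key value` lines, the whitespace-separated layout that `zabbix_sender` reads from an input file.  The `opcache.` key prefix and the function names are assumptions for this example, and matching trapper items would need to exist on the Zabbix side.

```python
def flatten_status(status: dict, prefix: str = "opcache"):
    """Yield (dotted key, numeric value) pairs from a nested status dict."""
    for key, value in status.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            yield from flatten_status(value, name)
        elif isinstance(value, bool):
            yield name, int(value)      # send true/false as 1/0
        elif isinstance(value, (int, float)):
            yield name, value
        # Strings and lists are skipped to keep the output purely numeric.


def sender_lines(host: str, status: dict) -> list:
    """Format the flattened status as 'host key value' lines."""
    return [f"{host} {key} {value}" for key, value in flatten_status(status)]
```

Skipping non-numeric values is a deliberate simplification: it avoids quoting rules and keeps every item graphable.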

This alternative is neater because it adds Opcache to your central server
monitoring and is less likely to vomit information about your setup to the
internet.

You'll find pre-built solutions for this on Github.

In summary


Let's quickly sum up what we've been through.  We've learned that Opcache
improves the speed of your application by storing a compiled version of it
in memory.

We've learned about the three settings that are most important for
performance:

  1. `memory_consumption` - which limits how much memory Opcache can use
  2. `max_accelerated_files` - which limits how many different files or versions of files Opcache can store
  3. `max_wasted_percentage` - which helps Opcache to decide whether it should restart when it is full, or rather force new files to get compiled whenever they're called.

We also looked at the `validate_timestamps` setting.  Setting this to zero
means that PHP won't need to make disk requests to see if a file has been
updated, but this can cause issues with deployment.