02 June 2017

Logging at scale in PHP

Logging can be an expensive operation if you're writing to disk.  It's probably not a problem for your application at low volumes, but once you have even tens of thousands of users on your site at the same time, the I/O becomes expensive.

An obvious way to reduce the I/O is to turn debug logging off in production.  You only want to log messages that you need to keep, not a full trace of every person's visit to the site.  This does reduce the logs' usefulness if you later try to use them to debug a problem.

If you're using the Monolog package then you should know about the "fingers crossed" handler.  In this mode of operation Monolog holds debug log lines in a buffer and only outputs them if a warning or error is logged too.  This lets you log less in the general case but still have full debug logs in the event of a problem.

🛈 Laravel and Symfony both use Monolog for their logging.  
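
To make the "fingers crossed" idea concrete, here is a minimal sketch of the buffering behaviour in plain PHP.  It is not Monolog's code: the FingersCrossedSketch class, its level names and the callable writer are all made up for this illustration.  With Monolog itself you would wrap your normal handler in its FingersCrossedHandler rather than write anything like this yourself.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative sketch only: FingersCrossedSketch is a hypothetical class,
 * not Monolog's implementation.
 */
final class FingersCrossedSketch
{
    private const LEVELS = ['debug' => 100, 'info' => 200, 'warning' => 300, 'error' => 400];

    /** @var callable receives each line that is actually written out */
    private $writer;

    /** @var int records at or above this level trigger the flush */
    private $activationLevel;

    /** @var string[] lines held back until something goes wrong */
    private $buffer = [];

    /** @var bool true once a warning or error has been seen */
    private $activated = false;

    public function __construct(callable $writer, string $activationLevel = 'warning')
    {
        $this->writer = $writer;
        $this->activationLevel = self::levelValue($activationLevel);
    }

    public function log(string $level, string $message): void
    {
        $value = self::levelValue($level);
        $line = strtoupper($level) . ': ' . $message;

        if ($this->activated) {
            // A problem was already seen, so write everything straight through.
            ($this->writer)($line);
            return;
        }

        $this->buffer[] = $line;

        if ($value >= $this->activationLevel) {
            // First warning or error: flush the buffered lines in order.
            $this->activated = true;
            foreach ($this->buffer as $buffered) {
                ($this->writer)($buffered);
            }
            $this->buffer = [];
        }
    }

    private static function levelValue(string $level): int
    {
        if (!isset(self::LEVELS[$level])) {
            throw new InvalidArgumentException("Unknown log level: $level");
        }
        return self::LEVELS[$level];
    }
}

// Usage: nothing is written until the warning arrives, then all three lines appear.
$logger = new FingersCrossedSketch(function (string $line): void {
    error_log($line);
});
$logger->log('debug', 'Loaded user 42');
$logger->log('info', 'Cart has 3 items');
$logger->log('warning', 'Payment gateway is slow');
```

If the request had finished without a warning or error, the two buffered lines would simply have been thrown away, which is where the I/O saving comes from.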

Another problem that emerges when you start scaling your application is that inevitably you're going to be running it in more than one place.  How do you sensibly log when a user's logs are split up over several different servers or multiple service containers?

The answer is to centralize your logs.  By sending your logs to one central logging service you'll be able to make sense of them.  Another advantage of this sort of service is that it will offer functionality such as searching and aggregating the logs.

The key point to consider when you're looking at centralizing your logging is how to get your logs there.  You do not want your PHP application to have to contact other servers itself to send logs, because each request would then be waiting on the network!

Many Linux distributions already ship with the rsyslog program, which offers a way to asynchronously send logs to another server.  Any solution that you decide on should either use this service or offer the same functionality.

Setting it up is very simple: you configure PHP to write logs to "syslog" instead of to a file, then you configure rsyslog to forward your syslog on to the other service.  Most of the logging platforms will have scripts to help you do this, but it is really very easy to set up.
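
On the PHP side, a rough sketch of writing to syslog with PHP's built-in openlog(), syslog() and closelog() functions might look like the following.  The logToSyslog() helper and the "my-php-app" ident are names invented for this example, and the LOG_USER facility is just an assumption about how you want the messages filed.  PHP's own error messages can be routed the same way by setting error_log = syslog in php.ini, and if you use Monolog it has a SyslogHandler that covers the same job.

```php
<?php
declare(strict_types=1);

/**
 * Send one message to the local syslog daemon.
 * logToSyslog() and the "my-php-app" ident are hypothetical names for this sketch.
 */
function logToSyslog(string $message, int $priority = LOG_INFO): bool
{
    // LOG_PID adds the process ID to each line; LOG_ODELAY waits until the
    // first message before opening the connection to the logger.
    openlog('my-php-app', LOG_PID | LOG_ODELAY, LOG_USER);
    $ok = syslog($priority, $message);
    closelog();

    return $ok;
}

// Usage: PHP only talks to the local syslog; forwarding is left to rsyslog.
logToSyslog('Checkout completed');
logToSyslog('Payment gateway timed out', LOG_WARNING);
```

The application never opens a connection to the central logging service here.  It hands the message to the local syslog and carries on with the request.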

For example, adding this line to your rsyslog configuration will cause it to forward all of your syslog messages to a Graylog host over UDP (the single @ before the host name selects UDP):

    *.* @graylog.example.org:514;RSYSLOG_SyslogProtocol23Format

There are several commercial logging systems, such as Loggly and Splunk.  These are great options because you won't need to worry about maintaining the logging service yourself and can simply pay to use it.

If you would prefer to use your own servers, then Graylog is an excellent open-source alternative.