End the Chaos, Get with the Program: Python Logging to Syslog, and Filtering with RSyslog

This article describes how to use Python's logging library to send logs to syslog. Then, using rsyslog, a fancier syslog, we produce a log of pre-filtered output.

I have a process that runs a few times an hour, and it spits out results to STDOUT, which cron dutifully emails to me. As you might imagine, I get thousands of emails which I never read.

I needed to send this information to a log, and one that's not filled with extraneous information.

Rsyslog is the standard logging daemon in Debian-based distros, and it's like a magical, difficult-to-use, all-in-one log routing daemon. Not only does it support BSD syslog's facility.priority model, but it can filter the content of the log line.

(It can do a lot more, but we won't get into that.)

Python has a elaborate logging library, logging, that has features that overlap with rsyslog, almost to the point where rsyslog is superfluous, but to keep complexity out of the code, we're going to use just a few features.

Adding Logging to the Python code

logging is a library that lets you piece together a logging solution. Here's the code extracted from the top of my utility script:

import logging
import logging.handlers
from logging.handlers import SysLogHandler

sysl = SysLogHandler(address='/dev/log', facility=SysLogHandler.LOG_USER)
sysl.setFormatter(logging.Formatter('remote_db_runner: %(levelname)s - %(message)s'))
logger = logging.getLogger('remote_db_runner')
logger.info("remote_db_runner decorator logging started")

SysLogHandler sends the log lines to a syslogd server. The named parameter address='/dev/log' causes the program to communicate via a Unix domain socket with the log server.

logging.Formatter sets a template for the log line. logging.getLogger creates a new logger. addHandler adds a handler to the logger; you can add more than one.

Then, once it's all set up, it writes a message to the logger, rsyslogd.

Making a New Log File with Rsyslogd

In Debian, the files in /etc/rsyslog.d/ that end in .conf are used to configure rsyslogd. Create a new file, /etc/rsyslog.d/company.conf:

*.* -/var/log/company.log

This configuration logs everything to company.log.

Then restart the service: sudo service rsyslogd restart

You should see a new file, /var/log/company.log with all the log messages on the system going to it. If it's not too busy, great. You can run your Python script and see if the log message appears. If it does, great.

Now, we need to change the configuration so it only logs lines that contain "remote_db_runner:". Change the company.conf file to:

# log specific lines to company debug log
:rawmsg,contains,"remote_db_runner:" -/var/log/company.log

Delete company.log, restart rsyslogd, and then look at the log file after running the script. It should only show the new lines that we're logging from the Python script.

(The configuration we used conflicts a little with the docs, but reading the IETF standard, RFC5424, hints at why we must match "rawmsg" instead of "msg", and why we use "contains" instead of "startswith". The logging standard delivers data to the logging server as a data record, not simply a string.)

With this config file, we can add lines to include other log lines produced by our code, if we are careful to format the log lines via logging.Formatter().



Python logging