Phoenix in Production with systemd

In this post, I'm going to explore how we can leverage systemd to ensure our apps are kept running even following a power outage or whole system crash.

Running Elixir in production

For those not familiar with running an Elixir application, there are two main approaches in the community. The first is a straightforward PORT=4001 MIX_ENV=prod mix phoenix.server. This approach is very simple, but it has some requirements that may not suit everyone: the server, for example, must have Erlang and Elixir installed.

The other major method is to leverage Erlang releases, which is what we'll be doing here.

Why do we care about releases?

Much of the world has gone a little crazy about Docker, using self-contained, immutable containers for packaging their apps. Interestingly, Erlang has long supported releases, which do a significant subset of what folks are trying to achieve with Docker. You can think of a release as being in the same category as a self-contained binary: it brings with it everything it needs to run.

Generally a release contains the Erlang runtime, though like most things, it's configurable. This matters because a release that bundles the runtime is tied to the platform it was built on, which can be troublesome if you build on a different platform to where it will run. If I build a release on OSX, it won't run on Ubuntu. Some folks get frustrated by this, but I think it's pretty reasonable.

A release is essentially a collection of parts built into your application. If your factory built a car with a petrol engine, we shouldn't be surprised if it doesn't run on diesel!

The first great struggle

Many of us have embraced 12 Factor as a way of ensuring our applications are readily configurable and scalable. One aspect of this is using environment variables to configure aspects of our applications at runtime.

It often comes as a surprise to folks new to deploying Elixir applications that their environment variables are not respected when they come to run a release. The issue is that config files are evaluated at compile time, so a value read from the environment is fixed when the release is built, not when it starts. I'm not going to rehash the good information available at plataformatec, so read for yourself.

Environment variables and edeliver

I've been relying very heavily on edeliver for our deploys. Edeliver leans heavily on relx and provides the deployment part of releases. It's a great and very straightforward tool.

In the process of getting our application deployable, I found it a big challenge to make environment variables available to the system.

The challenge arises because environment variables you may have exported in, say, .bashrc aren't available when edeliver executes commands for you. One solution is to add your environment variables to .profile, which will make them available when edeliver executes commands. For more information, have a look at the configuration section of the edeliver documentation.
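
As a rough sketch of that workaround, the deploy user's ~/.profile might contain lines like the following. The values are placeholders: MIX_ENV and PORT echo the mix example earlier in this post, and the database hostname is made up for illustration.

```shell
# ~/.profile for the deploy user (placeholder values, not from a real setup)
export MIX_ENV=prod
export PORT=4001
export DB_HOST=db.example.internal   # hypothetical hostname
```

Which variables belong here depends entirely on what your own config reads at build time.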

The trouble with upstart

Dealing with jobs that require environment variables is a non-trivial problem with upstart, and there's no clear, agreed convention for resolving it. There's a ton of advice, hacks and strategies to get it working, but none of them made me feel warm and fuzzy.

As it currently stands, the Phoenix documentation only discusses upstart. It does share one way of dealing with environment variables.

With that approach it's possible to provide environment variables, but often we'd want this configuration to be part of an Ansible script or some other version-controlled system configuration tool. This makes handling of sensitive data, like passwords, troublesome.

A real world problem

I recently dealt with an odd and intermittent fault in one application I work on. The problem presented as the Postgrex component of our application being unable to find our database hosted on AWS. The issue was difficult to reproduce and was always resolved by simply stopping the release and starting it again. No other intervention was necessary: no server reboots, nothing.

The error showing was Postgrex.Protocol (#PID<0.1511.0>) failed to connect: ** (DBConnection.ConnectionError) tcp connect: non-existing domain - :nxdomain. This error seemed like something Erlang / Elixir should be able to recover from, but alas it did not.

On trawling the system logs we discovered that the problem only presented itself on reboot, which was a great clue. It seemed that a kind of race condition sometimes exists, where the Elixir application is started before all of the networking aspects of the OS have been initialized. This caused problems because our database is not hosted on the same machine and is located via AWS DNS.

Systemd to the rescue

One of the nice things about systemd is you can specify the order in which things should happen in a declarative manner.

This file is placed at /etc/systemd/system/my_phoenix_app.service

[Unit]
Description=Runner for My Phoenix App
After=network.target

[Service]
WorkingDirectory=/opt/path_to_my_phoenix_app
EnvironmentFile=/etc/default/my_phoenix_app.env
ExecStart=/opt/path_to_my_phoenix_app/bin/my_phoenix_app start
ExecStop=/opt/path_to_my_phoenix_app/bin/my_phoenix_app stop
User=ubuntu
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Let's just call out a few interesting things here.

In our [Unit] definition we pass the After directive, so our service is only started once the network target has been reached.
We use the EnvironmentFile directive to set up all our environment variables for the service.
We can specify which User the code is executed as.
We use RemainAfterExit, which means we can launch the release with start rather than having to run it with foreground.
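
If the DNS race still shows up with After=network.target, systemd also offers a stricter target, network-online.target, which waits until the network is considered up. This is not part of the setup described above. The sketch below only shows how such an ordering could be layered on with a drop-in file. The helper name and the drop-in file name are made up for illustration.

```shell
# Hypothetical helper: writes a drop-in that makes the service wait for
# network-online.target. Pass the systemd unit directory as the first argument.
write_network_online_dropin() {
  dropin_dir="$1/my_phoenix_app.service.d"
  mkdir -p "$dropin_dir" || return 1
  # Wants= pulls the target in; After= orders our service behind it.
  cat > "$dropin_dir/network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

On a real host this would be run as root with write_network_online_dropin /etc/systemd/system, followed by systemctl daemon-reload. It only helps if a wait-online service (for example systemd-networkd-wait-online.service or NetworkManager-wait-online.service) is enabled on the machine.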

The particularly nice thing, in my opinion, is the ability to put our environment variables into a file. This gives the ability to securely store credentials on the server without checking them into version control. Alternatively, it provides a nice single source of truth for committing to whatever credential management system / tool / package you're using.

For systemd to recognize the service, we must execute systemctl daemon-reload after creating the file.

Once this has been done we can do a one-off start with systemctl start my_phoenix_app.service, or have it start automatically on every boot with systemctl enable my_phoenix_app.service.
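
Put together, the activation steps could be wrapped in a small helper like the one below. The function name is made up for illustration, and it assumes it is run as root on a machine where systemd manages services.

```shell
# Hypothetical helper bundling the activation steps; run it as root.
# The unit name is passed in so the function makes no assumptions about it.
activate_service() {
  unit="$1"
  # Re-read unit files so systemd sees the new or edited definition.
  systemctl daemon-reload || return 1
  # Register the unit to be started on every boot.
  systemctl enable "$unit" || return 1
  # Start it right now, without waiting for a reboot.
  systemctl start "$unit"
}
```

Calling activate_service my_phoenix_app.service runs all three steps. Afterwards, systemctl status my_phoenix_app.service and journalctl -u my_phoenix_app.service are the usual places to check whether the release came up.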

Here's an example of a .env file that might be used by a Phoenix / Ecto application. Take particular note that we set RELX_REPLACE_OS_VARS=true; without this our release would not try to evaluate the environment variables. Update: Thanks to David Kuhta for pointing out that if you're using Distillery, rather than exrm, use REPLACE_OS_VARS=true instead.

This file lives at /etc/default/my_phoenix_app.env and is referenced from the service definition.

HOME=/path_to_release

SECRET_KEY_BASE=
DB_PASSWORD=
SENDGRID_API_KEY=
DB_NAME=
DB_USER=
DB_HOST=
HONEYBADGER_API_KEY=
RELX_REPLACE_OS_VARS=true
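
Env files like this tend to grow by copy and paste, and a repeated key is easy to miss, with only one of the values ending up in the service's environment. As a small sanity check, a helper along these lines could list any variable assigned more than once. The function name is made up for illustration, and it assumes one KEY=value assignment per line.

```shell
# Hypothetical helper: print every variable name assigned more than once
# in an env file. Blank lines and lines starting with # are ignored.
duplicate_env_keys() {
  awk -F= '
    /^[[:space:]]*$/ { next }            # skip blank lines
    /^[[:space:]]*#/ { next }            # skip comment lines
    { if (seen[$1]++ == 1) print $1 }    # report a key on its second sighting
  ' "$1"
}
```

Running duplicate_env_keys /etc/default/my_phoenix_app.env prints nothing when every key is unique.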