Elixir In-Memory Cache


Here's a way to create an in-memory cache in Elixir. Fun!

Use case

I thought I needed Redis. But, like many other things, Erlang/OTP can handle this for you. Simplify the stack!

Erlang (and thus Elixir) ships with Erlang Term Storage (ETS), an in-memory store. A table you create there lives as long as the process that owns it, so restarts dump the cache. If you need the data to survive longer, you can dump it to the filesystem too.
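As a quick illustration of plain ETS (no Cachex yet; the table, key, and file names here are just for demo):

```elixir
# Create a named, in-memory table owned by the current process.
table = :ets.new(:demo_cache, [:set, :named_table, :public])

# Write and read back a term by key.
:ets.insert(:demo_cache, {:fantasy, ["The Hobbit"]})
[{:fantasy, books}] = :ets.lookup(:demo_cache, :fantasy)

# Optionally dump the table to disk so it survives a restart.
:ets.tab2file(table, ~c"/tmp/demo_cache.ets")
```

When the owning process exits, the table is gone -- which is exactly why Cachex hangs its tables off a supervised process.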

ETS is good. There's a nice lib called cachex that makes it a bit better. A nice touch here is that you can set up a cache warmer to populate the cache on an interval.

This was all great for me. The data was a bit expensive to calculate but known well in advance and changing daily.

Let's get this starty parted!

Install lib

First, specify it in mix.exs:

defp deps do
  [
    {:cachex, "~> 4.0"}
  ]
end

And fetch it in the terminal:

mix deps.get

Create cache

The cache is simple. It has a table name, @cache_name, and one accessor function that pulls a value out of the cache by key.

In this example code, we're making a cache of books by genre.

defmodule MyProj.BookCache do
  require Logger
  @cache_name :books

  def cache_name(), do: @cache_name

  def get_books!(genre_id) do
    case Cachex.get(@cache_name, genre_id) do
      {:ok, nil} ->
        []

      {:ok, books} ->
        books

      error ->
        Logger.error(
          "Failed to get book cache by #{genre_id} #{inspect(error)}"
        )

        []
    end
  end
end

Proactive Warming

The cache module can stay so simple because the cache has already been populated by a warmer. And it's been done proactively -- up front, on application startup. This warmer will get all the books and put them into the cache keyed by genre_id.

It will re-warm itself (based on interval) every 24 hours.

defmodule MyProj.BookCache.Warmer do
  use Cachex.Warmer
  require Logger

  def interval, do: :timer.hours(24)

  def execute(_state) do
    with {:ok, books} <-
           MyProj.Book.read(load: [:genres]),
         {:ok, cache} <-
           map_cache(books) do
      {:ok, cache}
    else
      error ->
        Logger.error("Unable to populate book cache #{inspect(error)}")
        # :ignore tells Cachex to skip this warming cycle
        :ignore
  end

  defp map_cache(books) do
    genre_ids = 
      books
      |> Enum.flat_map(fn book -> book.genres end)
      |> Enum.map(fn genre -> genre.id end)
      |> Enum.uniq()

    cache =
      genre_ids
      |> Enum.reduce(%{}, fn genre_id, acc ->
        Map.put(
          acc,
          genre_id,
          filter_books_expensive_op(books, genre_id)
        )
      end)
      |> Map.to_list()

    {:ok, cache}
  end

  defp filter_books_expensive_op(books, _genre_id) do
    # $$$... a reason for the cache
    books
  end
end

Start process

We add our cache and its warmer as a child process in our application.ex supervision tree; it runs as its own GenServer. If the cache fails to initialize and warm, app startup will fail. Put the cache process before your endpoint process -- that way, the cache is guaranteed to be warm by the time the app starts serving requests.

defmodule MyProj.Application do
  use Application
  import Cachex.Spec

  @impl true
  def start(_type, _args) do
    children = [
      # ...
      Supervisor.child_spec(
        {Cachex,
         name: MyProj.BookCache.cache_name(),
         warmers: [warmer(module: MyProj.BookCache.Warmer)]},
        id: MyProj.BookCache.cache_name()
      ),
      # ... MyProjWeb.Endpoint,
    ]
    # ...
  end
end

Accessing

Simply call the accessor function on the cache module to retrieve what's in the cache.

books_in_genre = MyProj.BookCache.get_books!(genre_id)

Note that misses on the cache in my implementation do not try a JIT refresh of the cache.

Test

In ExUnit, in the setup function, we first set up all our data so that it's in place for the cache warmer to pull from. Then we invoke the warmer. The warmer's execute function returns the full cache result.

We take that output and call Cachex.put_many on this data set manually. This is so that it will be available at test time.

This is different from application startup time, where the Supervisor.child_spec({Cachex, name: , warmers: ...}) process starts the warmer and associates it with the specific cache by name. Here in test, we act independently of the supervision tree and must take one extra step to populate the cache with the output of the warmer. If cache warming fails, the test fails with flunk.

alias MyProj.BookCache
alias MyProj.BookCache.Warmer

setup do
  # data set up ...

  case Warmer.execute(%{}) do
    {:ok, cache} ->
      Cachex.put_many(BookCache.cache_name(), cache)
      :ok

    error ->
      flunk("Cache warming failed: #{inspect(error)}")
  end
end

Those should be all the pieces. Avoid the cache if you can. But if you need one, this approach might be nice.

Modified scenario: reactive warming

Bonus points.

Let's say that instead of proactively warming the cache, we want reactive, or just-in-time, warming: when the cache misses, it warms itself. How would things look different?

The cache module will look different. Instead of just reading from the cache, it will warm on a miss. And when values are set in the cache, a 24-hour expiry will be set at the same time.

defmodule MyProj.BookCache do
  require Logger
  @cache_name :books

  def cache_name(), do: @cache_name

  def get_books!(genre_id) do
    case Cachex.get(@cache_name, genre_id) do
      {:ok, nil} ->
        {:ok, cache} = MyProj.BookCache.JitWarmer.execute()

        if Enum.count(cache) > 0 do
          Cachex.put_many(@cache_name, cache, expire: :timer.hours(24))

          case Cachex.get(@cache_name, genre_id) do
            {:ok, nil} -> []
            {:ok, books} -> books
            _ -> []
          end
        else
          []
        end

      # same as before...
    end
  end
end

The warmer can be different too. It no longer has to implement the Cachex.Warmer behaviour. It also wouldn't have to populate the whole cache for all genres' books at once, though it could.

Now that this is a reactive cache, warming happens during client requests, so the cache-miss requests will be slower. It might make sense to do less in the warmer, such as populating a single genre -- though that may not make a material difference either.
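A minimal sketch of what that slimmer warmer could look like. MyProj.BookCache.JitWarmer, MyProj.Book.read/1, and filter_books_expensive_op/2 are names carried over from the examples above; the single-genre variant itself is hypothetical:

```elixir
defmodule MyProj.BookCache.JitWarmer do
  # A plain module -- no `use Cachex.Warmer` needed, since Cachex
  # isn't scheduling it; the cache module calls it on a miss.

  # Warm a single genre instead of the whole cache. Returns
  # {:ok, pairs} where pairs is a list of {key, value} tuples
  # suitable for Cachex.put_many/3.
  def execute(genre_id) do
    case MyProj.Book.read(load: [:genres]) do
      {:ok, books} ->
        {:ok, [{genre_id, filter_books_expensive_op(books, genre_id)}]}

      error ->
        {:error, error}
    end
  end

  defp filter_books_expensive_op(books, _genre_id) do
    # $$$... the expensive filtering from before
    books
  end
end
```

The cache module would then pass the missed genre_id into execute/1 instead of calling it with no arguments.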