cult3

Adding a Table of Contents to Nimble Publisher

Jan 09, 2023

Table of contents:

  1. How is this going to work?
  2. Creating a new project
  3. Setting up Nimble Publisher
  4. Appending the Table of Contents
  5. Adding the Table of Contents links
  6. Conclusion

I really love using Nimble Publisher in my Elixir projects. Nimble Publisher is a very simple way to add Markdown publishing to your Elixir application. This means you can write your content for blog posts, articles, or web pages, and have Nimble Publisher convert it to HTML ready to be displayed online without any extra overhead or dependencies. I’ve used Nimble Publisher in all of my recent projects, including this very blog.

An important thing to consider when writing long blog posts or articles is providing a Table of Contents that shows an overview of what is covered. This is important for readers who might want to see if the content is relevant or if they want to jump to the specific section they are interested in. Providing a Table of Contents is also important for SEO and it can help your content perform better in search results.

In this tutorial we’re going to be looking at adding a Table of Contents to a Nimble Publisher blog.

How is this going to work?

So before we jump into the code, let’s take a look at how this is going to work at a high level.

Nimble Publisher reads your Markdown files that are typically stored in the priv directory of the project. These files are your individual items of content, for example, blog posts, or articles. These files are parsed and split into meta attributes and the body of the content.

The body of the content, which is written in Markdown, is then passed to another Elixir package called Earmark. This package converts the Markdown into HTML so it’s ready to be displayed on a web page.

To add a Table of Contents, we need to extract the headings from the document during the parsing stage and prepend the Table of Contents to the top of the document. We also need to modify each header of the document to add an anchor id so that the reader is able to click on an item in the Table of Contents and be automatically taken to that section of the document.

So, now that we have an understanding of what’s involved, let’s start building the Table of Contents functionality!

Creating a new project

The first thing I’m going to do is to create a new Phoenix project as an example repo you can follow along with:

mix phx.new table-of-contents --app app

You can find the repo on GitHub, here: culttt/table-of-contents.

Next I’m going to add Nimble Publisher as a dependency in the mix.exs file:

{:nimble_publisher, "~> 0.1.3"}

Run the following command in terminal to pull in the dependency:

mix deps.get

Setting up Nimble Publisher

Next we’re going to need some Markdown files. I’m going to just create a single dummy Markdown file for the purpose of this tutorial, but in reality this would be your blog posts, articles, or whatever you wanted to publish to the internet. Here is a link to the markdown file I created.

At the top of the Markdown document is a section of meta attributes. In this example I’ve only included the title of the article “Hello, world!”, but you could also use this section to include author details, tags, a summary of the post, or any other attributes you’d want to include.

We then separate the meta attributes from the contents of the blog post with ---.
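The linked file isn't reproduced here, so as a rough sketch, a post in this format might look like the contents below. The headings and body text are made up for illustration; only the title comes from the example above. The snippet keeps the file contents in a string and checks that the part above the --- separator evaluates to an Elixir map.

```elixir
# Hypothetical file contents; only the title comes from the tutorial.
contents = """
%{
  title: "Hello, world!"
}
---
## First section

Some text.

## Second section

More text.
"""

# Everything above the separator is Elixir code that evaluates to a map.
[meta, body] = String.split(contents, "\n---\n", parts: 2)
{attrs, _bindings} = Code.eval_string(meta)

IO.inspect(attrs)
IO.puts(body)
```

Because the meta section is evaluated as Elixir code, it should only ever come from files you trust.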

Create a new directory called blog under the priv directory of your project and save this file as a .md file with the name of {year}-{month}-{day}-hello-world.md. For example, my document is named 2023-01-09-hello-world.md.

Next, we need to add a couple of modules so Nimble Publisher can pull our Markdown files and convert them into structs with HTML contents.

First, create a new file called blog.ex under the lib/app directory in your project with the following contents:

defmodule App.Blog do
  @moduledoc """
  Blog posts
  """

  alias App.Blog

  use NimblePublisher,
    build: Blog.Post,
    from: Application.app_dir(:app, "priv/blog/*.md"),
    as: :posts

  defmodule NotFoundError do
    defexception [:message, plug_status: 404]
  end

  def all, do: @posts

  def get!(id) do
    Enum.find(all(), &(&1.id == id)) ||
      raise NotFoundError, "post with id=#{id} not found"
  end
end

This is the entry point to our blog functionality. First we set up NimblePublisher with the following use statement:

use NimblePublisher,
  build: Blog.Post,
  from: Application.app_dir(:app, "priv/blog/*.md"),
  as: :posts

This tells Nimble Publisher where our Markdown documents are stored, and allows us to configure how the files are converted into structs.

We also have the following in the same file:

defmodule NotFoundError do
  defexception [:message, plug_status: 404]
end

def all, do: @posts

def get!(id) do
  Enum.find(all(), &(&1.id == id)) ||
    raise NotFoundError, "post with id=#{id} not found"
end

Once Nimble Publisher has generated the structs from the Markdown files, it will save the structs in the @posts module attribute. In the code above you can see we can return all of the posts using the all/0 function.

We also have a get!/1 function that allows us to find a specific post by its id. If the post is not found, an exception is raised. This uses a custom exception that has a plug_status: 404 attribute. This means when this exception bubbles up, Plug (i.e. Phoenix in our case) will return the correct HTTP response code.
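To see the lookup and the exception in isolation, here is a minimal sketch that doesn't need Nimble Publisher. The Sketch.Blog and Sketch.NotFoundError modules and the hard-coded list of posts are stand-ins for illustration, not part of the example repo.

```elixir
defmodule Sketch.NotFoundError do
  # Same shape as the exception in App.Blog: a message plus a default status.
  defexception [:message, plug_status: 404]
end

defmodule Sketch.Blog do
  # Stand-in for the @posts attribute that Nimble Publisher would populate.
  @posts [%{id: "hello-world", title: "Hello, world!"}]

  def all, do: @posts

  def get!(id) do
    Enum.find(all(), &(&1.id == id)) ||
      raise Sketch.NotFoundError, "post with id=#{id} not found"
  end
end

found = Sketch.Blog.get!("hello-world")

# Capture the exception so we can look at the fields it carries.
error =
  try do
    Sketch.Blog.get!("missing")
  rescue
    e in Sketch.NotFoundError -> e
  end

IO.inspect(found.title)
IO.inspect({error.message, error.plug_status})
```

The plug_status field is just data on the exception struct. Plug reads it when the exception reaches the web layer to decide which status code to send.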

The build: Blog.Post option of the use NimblePublisher declaration hints at what we need to do next. The Post module allows us to define how the attributes of the Markdown file are converted into the properties of the struct. Here is the implementation of the Post module we’re going to be using:

defmodule App.Blog.Post do
  @moduledoc """
  Blog post
  """

  @keys ~w{id title body date}a
  @enforce_keys @keys
  defstruct @keys

  def build(filename, attrs, body) do
    [year, month, day, id] =
      filename
      |> Path.rootname()
      |> Path.split()
      |> List.last()
      |> String.split("-", parts: 4)

    date = Date.from_iso8601!("#{year}-#{month}-#{day}")

    struct!(__MODULE__, Map.merge(attrs, %{
      id: id,
      body: body,
      date: date
    }))
  end
end

As you can see, Nimble Publisher allows us to define our own integration point so we can choose how the struct is created using the build/3 function. In this case we’re extracting the publish date and post id from the filename of the Markdown document and then we’re building the struct using the struct!/2 function.
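The filename handling is easy to try on its own. This sketch runs the same pipeline as build/3 on a hard-coded path, which is an assumption here; in the real module the path is supplied by Nimble Publisher.

```elixir
# Hypothetical path; in build/3 this value is supplied by Nimble Publisher.
filename = "priv/blog/2023-01-09-hello-world.md"

[year, month, day, id] =
  filename
  # Drop the ".md" extension
  |> Path.rootname()
  # Split the path into its segments
  |> Path.split()
  # Keep only "2023-01-09-hello-world"
  |> List.last()
  # Split on the first three hyphens so the id keeps its own hyphens
  |> String.split("-", parts: 4)

date = Date.from_iso8601!("#{year}-#{month}-#{day}")

IO.inspect({id, date})
```

The parts: 4 option is what keeps hyphenated ids such as hello-world in one piece.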

This gives us complete control over how these structs are created and allows us to add any customisations or extra functionality that might be required.

Now, if you open up iex (for example with iex -S mix) you should be able to see the Markdown file we created earlier as a parsed struct:

iex(1)> App.Blog.all()

Appending the Table of Contents

Now that we have set up Nimble Publisher and everything is working correctly, we need to look at adding the Table of Contents. We don’t want to have to manually add a Table of Contents to each blog post because that would be annoying to maintain, but we can generate the Table of Contents fairly easily when Nimble Publisher is parsing the contents of our Markdown files.

Nimble Publisher allows us to supply our own parser: a module with a parse/2 function that accepts the path of the file and its raw contents. This function should return a 2-element tuple where the first element is the meta attributes, and the second element is the content. We register the module with the parser option:

use NimblePublisher,
  build: Blog.Post,
  parser: Blog.Parser,
  from: Application.app_dir(:app, "priv/blog/*.md"),
  as: :posts

Create a file under lib/app/blog called parser.ex with the following contents:

defmodule App.Blog.Parser do
  @moduledoc """
  Custom Nimble Publisher parser
  """

  def parse(path, contents) do

  end
end

The first thing we need to do is to split the contents of the file into the attributes and the body. We can just copy and paste this function from Nimble Publisher because we don’t actually need to do anything different at this stage:

defp split(path, contents) do
  case :binary.split(contents, ["\n---\n", "\r\n---\r\n"]) do
    [_] ->
      {:error, "could not find separator --- in #{inspect(path)}"}

    [code, body] ->
      case Code.eval_string(code, []) do
        {%{} = attrs, _} ->
          {:ok, attrs, body}

        {other, _} ->
          {:error,
            "expected attributes for #{inspect(path)} to return a map, got: #{inspect(other)}"}
      end
  end
end

This function will split the contents of the file and then evaluate the Elixir map that contains the meta attributes of the document.

Next, we can implement the parse/2 function:

def parse(path, contents) do
  with {:ok, attrs, body} <- split(path, contents) do
    # Collect the second-level headings as {original, title, slug} tuples
    headers =
      body
      |> String.split("\n\n")
      |> Enum.filter(&String.starts_with?(&1, "## "))
      |> Enum.map(fn original ->
        title = original |> String.replace_prefix("## ", "") |> String.trim()

        slug =
          title
          |> String.downcase()
          |> String.replace(~r/[^a-z]+/, "-")
          |> String.trim("-")

        {original, title, slug}
      end)

    if Enum.any?(headers) do
      {attrs, append_table_of_contents(body, headers)}
    else
      {attrs, body}
    end
  else
    # Fail loudly rather than handing Nimble Publisher an error tuple
    {:error, message} -> raise ArgumentError, message
  end
end

First we use the split/2 function to get the attributes and the body of the document. Next we need to extract the headers from the document. In this example I only care about the second level headings for the table of contents. If you wanted to include all heading types you would need to modify how the headers are extracted.
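If you did want more than one heading level, one possible approach is to pattern match on the heading prefix and keep the level alongside the title. This is only a sketch: the Sketch.Headings module and its function names are made up for illustration, and it handles second- and third-level headings only.

```elixir
defmodule Sketch.Headings do
  # Returns {level, title} for second- and third-level headings, nil otherwise.
  def parse("## " <> title), do: {2, String.trim(title)}
  def parse("### " <> title), do: {3, String.trim(title)}
  def parse(_block), do: nil

  # Splits the body into blocks, as the parser does, and keeps the headings.
  def extract(body) do
    body
    |> String.split("\n\n")
    |> Enum.map(&parse/1)
    |> Enum.reject(&is_nil/1)
  end
end

body = "## Setup\n\nSome text.\n\n### Install\n\nMore text.\n\n## Usage\n"
IO.inspect(Sketch.Headings.extract(body))
```

Carrying the level through would let you indent nested entries when building the list, and any extra heading level would also need its own anchor id for the links to work.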

Once we have a list of headings, we can use those headers to build a table of contents using the following function:

defp append_table_of_contents(body, headers) do
  table =
    headers
    |> Enum.with_index(1)
    |> Enum.map(fn {{_original, title, slug}, i} ->
      "#{i}. [#{title}](##{slug})"
    end)
    |> Enum.join("\n")

  "Table of contents:\n#{table}\n\n#{body}"
end

This will generate a simple Table of Contents that can be prepended to the document. The contents of the document are still in Markdown at this stage, so we need to generate the Table of Contents entries as Markdown links.
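To get a feel for the generated Markdown, here is a small standalone sketch that applies the same slug rules and list format to a couple of titles. The Sketch.Toc module is hypothetical and takes plain titles rather than the three-element tuples used in the parser.

```elixir
defmodule Sketch.Toc do
  # Same slug rules as the parser: lowercase, runs of non-letters become "-".
  def slug(title) do
    title
    |> String.downcase()
    |> String.replace(~r/[^a-z]+/, "-")
    |> String.trim("-")
  end

  # Builds a numbered Markdown list of links, one per title.
  def table(titles) do
    titles
    |> Enum.with_index(1)
    |> Enum.map(fn {title, i} -> "#{i}. [#{title}](##{slug(title)})" end)
    |> Enum.join("\n")
  end
end

IO.puts(Sketch.Toc.table(["How is this going to work?", "Creating a new project"]))
# 1. [How is this going to work?](#how-is-this-going-to-work)
# 2. [Creating a new project](#creating-a-new-project)
```

One thing to be aware of is that the slug keeps only the letters a to z, so headings that differ only in digits or punctuation (for example "Step 1" and "Step 2") would end up with the same slug.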

Adding the Table of Contents links

Now that we prepend a Table of Contents to each blog post, we need to link each entry to the matching header in the document using anchor links. This will allow readers to jump directly to the section of the blog post they are interested in.

To do this we can use a feature of Earmark (the Markdown processor we’re using to convert Markdown to HTML), which allows us to modify the processed Markdown file before it is converted to HTML.

First up we need to define a new module that will encapsulate our “post processing” functionality. Create a new file under lib/app/blog called processor.ex with the following module definition:

defmodule App.Blog.Processor do
  @moduledoc """
  Custom Earmark processor
  """
end

The Processor module should implement a 1-arity function that accepts the Earmark AST, makes the modification, and then returns the AST.

For our use-case, we want to match h2 nodes in the AST and update their attributes to include an anchor id. Here is how we could do that:

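# Add an id to h2 nodes that contain a single piece of plain text
def process({"h2", [], [text], %{}}) when is_binary(text) do
  anchor_id =
    text
    |> String.downcase()
    |> String.replace(~r/[^a-z]+/, "-")
    |> String.trim("-")

  {"h2", [{"id", anchor_id}], [text], %{}}
end

# Leave every other node unchanged
def process(value), do: value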

We need to take the text of the element and convert it to the same slug generated earlier in the Parser module. We can use this generated slug as the anchor id of the h2 element so when the link in the Table of Contents is clicked, the browser will take the user to the relevant section of the page.

As we’re matching on the “h2” element, we also need to provide a catch-all process/1 implementation that will return the unmodified value for every other element.
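You can try the two clauses without Earmark by passing in hand-written nodes. This sketch assumes nodes have the four-element tuple shape used above. The Sketch.Processor module and the sample nodes are made up for illustration.

```elixir
defmodule Sketch.Processor do
  # Matches an h2 node with no attributes and a single text child.
  def process({"h2", [], [text], meta}) when is_binary(text) do
    anchor_id =
      text
      |> String.downcase()
      |> String.replace(~r/[^a-z]+/, "-")
      |> String.trim("-")

    {"h2", [{"id", anchor_id}], [text], meta}
  end

  # Every other node passes through untouched.
  def process(node), do: node
end

heading = {"h2", [], ["Creating a new project"], %{}}
paragraph = {"p", [], ["Some text."], %{}}

IO.inspect(Sketch.Processor.process(heading))
IO.inspect(Sketch.Processor.process(paragraph))
```

The heading comes back with an id attribute that matches the slug used in the Table of Contents link, and the paragraph falls through to the catch-all clause.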

Now that we have the Processor module implemented, we can add the configuration to Earmark via the use NimblePublisher declaration in the App.Blog module:

use NimblePublisher,
  build: Blog.Post,
  parser: Blog.Parser,
  from: Application.app_dir(:app, "priv/blog/*.md"),
  as: :posts,
  earmark_options: [postprocessor: &Blog.Processor.process/1]

Now, if you open up iex again and inspect the return values of the App.Blog.all() function, you will see that the body of the blog post now includes an auto-generated HTML Table of Contents!

Conclusion

Nimble Publisher is a fantastic package that provides just enough functionality whilst exposing valuable points of integration so that you can customise or add additional functionality without much effort.

In this tutorial we’ve been able to leverage those points of integration to very easily add a Table of Contents. You can see this functionality in action at the top of this very blog post.

I hope this tutorial has given you inspiration for what you could build on top of Nimble Publisher.

Philip Brown

@philipbrown

© Yellow Flag Ltd 2024.