cult3

Add full-text search to an Elixir Phoenix application

Mar 23, 2023

Table of contents:

  1. How is this going to work?
  2. Setting up the project
  3. Setting up NimblePublisher
  4. Adding Haystack
  5. Adding the Search LiveView
  6. Conclusion

Yesterday I introduced Haystack, a full-text search engine written in Elixir.

In today’s article, I want to show you how easy it is to add Haystack to an Elixir Phoenix application.

You can find a working copy of the code in this tutorial at elixir-haystack/phoenix-live-view-example.

How is this going to work?

A common way to build Elixir Phoenix applications that might benefit from Haystack is to use the excellent NimblePublisher package. I’m a big fan of NimblePublisher, and I use it for all of my content-driven websites, including Culttt.

In this tutorial, we’re going to build a simple Phoenix application based upon NimblePublisher. I will then show you how you can integrate Haystack to add full-text search for your content.

This tutorial is going to be focused on indexing content from NimblePublisher, but in reality, the data you want to index can come from anywhere and it can be in any shape or form.

Setting up the project

The first thing we need to do is to create a new Phoenix application:

$ mix phx.new phoenix-live-view-example --app app --no-ecto

I’m passing the --no-ecto flag because we don’t want to set up a database for this project. Having a database as part of this project would kinda defeat the purpose of this blog post!

Now that we’ve got a Phoenix application, I’m going to quickly delete some of the default Phoenix stuff you get out of the box, and replace it with a simple SearchLive module so we can take advantage of Phoenix LiveView and have an interactive page to search the content. Here is the commit if you’re interested in exactly what I changed in this step.

Setting up NimblePublisher

The next thing I need to do is to install NimblePublisher and write some content. First up I need to add the dependency to my mix.exs file:

{:nimble_publisher, "~> 1.0"}

Next, run the following command in your terminal to add NimblePublisher to the project:

$ mix deps.get

Next, we need to create a struct for the articles of content. Create a new directory under lib/app called articles and then create a new file called article.ex in that new directory with the following contents:

defmodule App.Articles.Article do
  @moduledoc """
  A module for articles.
  """

  defstruct ~w{id slug name body}a

  def build(filename, attrs, body) do
    slug = filename |> Path.rootname() |> Path.split() |> List.last()

    struct!(__MODULE__, Map.to_list(attrs) ++ [slug: slug, body: body])
  end
end

Each article will have an id, slug, name, and body. In the build/3 function, I’m extracting the slug from the filename of the markdown file. The attrs will come from the frontmatter of the markdown file, and the body is the markdown that NimblePublisher has automatically converted to HTML for us.
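To make the build/3 contract concrete, here is a minimal sketch that calls the same logic by hand. The module name Example.Article, the file path, and the attribute values are all made up for illustration; in the real application NimblePublisher calls build/3 for you with the filename, the frontmatter map, and the rendered HTML body.

```elixir
# Hypothetical stand-in for App.Articles.Article so the snippet runs on its own.
defmodule Example.Article do
  defstruct ~w{id slug name body}a

  def build(filename, attrs, body) do
    # "priv/articles/hello-world.md" becomes "hello-world".
    slug = filename |> Path.rootname() |> Path.split() |> List.last()

    struct!(__MODULE__, Map.to_list(attrs) ++ [slug: slug, body: body])
  end
end

# Assumed inputs: roughly what would be passed for a file named
# priv/articles/hello-world.md whose frontmatter is %{name: "Hello World"}.
article =
  Example.Article.build(
    "priv/articles/hello-world.md",
    %{name: "Hello World"},
    "<p>Hello!</p>"
  )

IO.inspect(article.slug)
# prints "hello-world"
IO.inspect(article.id)
# prints nil, because the id is only assigned later in App.Articles
```

Because struct!/2 raises on keys the struct does not define, a frontmatter key other than id, slug, name, or body would fail loudly at compile time rather than being silently dropped.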

Next, we can create a new file under lib/app called articles.ex with the following contents:

defmodule App.Articles do
  @moduledoc """
  A module for managing articles.
  """

  alias App.Articles.Article

  use NimblePublisher,
    build: Article,
    from: Application.app_dir(:app, "priv/articles/*.md"),
    as: :articles

  @articles @articles
          |> Enum.reverse()
          |> Enum.with_index(1)
          |> Enum.map(fn {post, id} -> Map.put(post, :id, to_string(id)) end)

  def articles, do: @articles

  def take(ids) do
    Enum.map(ids, fn id ->
      Enum.find(articles(), & &1.id == id)
    end)
  end
end

You can think of this module as the “context” for articles within the application. It’s basically the public API of the article’s Bounded Context.

First, we set up NimblePublisher with the use NimblePublisher line. This will automatically read the markdown files from the filesystem and build a struct for each article. All of this will happen at compile time, and NimblePublisher is smart enough to know when the content needs to be re-compiled.

I’ve also added a take/1 function that accepts a list of ids and returns a list of articles. We’ll need this function to grab a list of articles from the search results of a query.

One thing to note is that I’m using Enum.with_index/2 (with an offset of 1) to set sequential ids on each article, stored as strings, rather than using the slug attribute as the ref. This keeps the amount of data stored in Haystack to a minimum, which reduces the memory footprint of your application.
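As a rough sketch of what that id pipeline does, here it is run against two plain maps. Example.Ids and the slugs are hypothetical; the real code works on the article structs at compile time.

```elixir
defmodule Example.Ids do
  # Assign 1-based, string ids to a list of article maps.
  def assign(articles) do
    articles
    |> Enum.reverse()
    # Pair each article with a counter that starts at 1 instead of 0.
    |> Enum.with_index(1)
    # Store the counter as a string on the article under :id.
    |> Enum.map(fn {article, id} -> Map.put(article, :id, to_string(id)) end)
  end
end

# Hypothetical article maps standing in for the structs NimblePublisher builds.
articles = [%{slug: "second-post"}, %{slug: "first-post"}]

articles
|> Example.Ids.assign()
|> Enum.map(&{&1.id, &1.slug})
|> IO.inspect()
# prints [{"1", "first-post"}, {"2", "second-post"}]
```

The ids are positional, so they can change when articles are added or removed. That is harmless in this setup because the index is rebuilt from the articles each time the application starts.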

If you open an iex terminal, you should be able to return a list of the article structs like this:

App.Articles.articles()

Here is the commit that shows me adding NimblePublisher to the repository.

Adding Haystack

Now that we’ve got a Phoenix application set up with data, we can add Haystack to the project so we can start using the full-text search.

The first thing to do is to add Haystack as a dependency to the project in your mix.exs file:

{:haystack, "~> 0.1.0"}

Run the following command in your terminal to pull Haystack into the project:

$ mix deps.get

The next thing I’m going to do is to add a new search.ex file under the lib/app/articles directory:

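defmodule App.Articles.Search do
  @moduledoc """
  A module for configuring Haystack search.
  """

  # Aliases used by the functions added in the rest of this section.
  alias App.Articles
  alias Haystack.{Index, Storage}
end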

The first thing I’m going to do is to add a new haystack/0 function that initialises a new %Haystack{} struct and configures the :articles index:

@doc """
Return the Haystack.
"""
def haystack do
  Haystack.index(Haystack.new(), :articles, fn index ->
    index
    |> Index.ref(Index.Field.term("id"))
    |> Index.field(Index.Field.new("name"))
    |> Index.field(Index.Field.new("body"))
    |> Index.storage(storage())
  end)
end

In this example, I only have a single index in my Haystack. However, the %Haystack{} struct can be used to manage multiple indexes in your application.

In the callback function, I’m adding the ref and two fields to the index. I’m also configuring the storage using the storage/0 function. Let’s take a look at that function next:

@doc """
Return the storage.
"""
def storage do
  Storage.ETS.new(name: :articles, table: :articles, load: &load/0)
end

I’m going to use the ETS storage implementation that is shipped with Haystack out-of-the-box. You’ll notice that this storage implementation has a :load option with a captured function. Let’s take a look at that function next:

@doc """
Load the storage.
"""
def load do
  Task.Supervisor.start_child(App.TaskSupervisor, fn ->
    Haystack.index(haystack(), :articles, fn index ->
      Articles.articles()
      |> Stream.map(&Map.take(&1, ~w{id name body}a))
      |> Enum.each(&Haystack.Index.add(index, [&1]))

      index
    end)
  end)

  []
end

The load/0 function is a way for you to build or restore data to your search index. In the example above, I’m using a Task to rebuild the data in the search index. This means that whenever the application is started, Haystack will automatically build the index asynchronously.
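The load/0 function above calls Task.Supervisor.start_child/2 with App.TaskSupervisor, so a Task.Supervisor registered under that name has to be running before the index is loaded, typically as a child in your application’s supervision tree. Here is a minimal, standalone sketch of that arrangement. Example.TaskSupervisor and the message sent back to the caller are hypothetical stand-ins; the message simply takes the place of the real indexing work.

```elixir
# Example.TaskSupervisor is a hypothetical name; the article's code expects
# a Task.Supervisor registered as App.TaskSupervisor.
children = [
  {Task.Supervisor, name: Example.TaskSupervisor}
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

parent = self()

# start_child/2 returns straight away; the function runs in its own process.
{:ok, task_pid} =
  Task.Supervisor.start_child(Example.TaskSupervisor, fn ->
    # Stand-in for the real indexing work.
    send(parent, {:index_built, self()})
  end)

receive do
  {:index_built, ^task_pid} -> IO.puts("index built in the background")
after
  1_000 -> IO.puts("timed out waiting for the task")
end
```

In a Phoenix project, the equivalent child spec would go in the children list of the application module, using the App.TaskSupervisor name that load/0 refers to.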

Alternatively, if you would prefer to build the index manually and then restore from the filesystem, you can also do that in this function. You’ll see that the return value of this function is an empty list. Whatever you return from this function will be given to the ETS storage implementation to be restored.

Finally, I’m going to add a helper function for actually taking a user’s input and performing a search against the index:

@doc """
Perform a search.
"""
def search(q) do
  Haystack.index(haystack(), :articles, fn index ->
    Index.search(index, q)
  end)
end

By default, Index.search/2 performs a match_any query, which matches any document that contains any of the terms in any of the indexed fields. Alternatively, you can pass query: :match_all as a third argument to Index.search/3 if you would prefer a query where all of the terms need to match all of the configured fields. Finally, you could also build your own custom %Query{} and run it manually against the index, which gives you complete control over how the query is configured.
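If the difference between the two query types feels abstract, the toy sketch below shows the general idea using nothing but the standard library. This is not Haystack’s implementation: Example.Match and its tokeniser are invented for illustration, and it treats each document as a single string rather than a set of indexed fields.

```elixir
# A toy illustration only; this is not how Haystack is implemented.
defmodule Example.Match do
  # Split text into a set of lowercase word tokens.
  def tokens(text) do
    text
    |> String.downcase()
    |> String.split(~r/\W+/u, trim: true)
    |> MapSet.new()
  end

  # True when at least one query term appears in the document.
  def match_any?(doc, query) do
    not MapSet.disjoint?(tokens(doc), tokens(query))
  end

  # True when every query term appears in the document.
  def match_all?(doc, query) do
    MapSet.subset?(tokens(query), tokens(doc))
  end
end

IO.inspect(Example.Match.match_any?("Red panda", "red fox"))
# prints true, because "red" is shared
IO.inspect(Example.Match.match_all?("Red panda", "red fox"))
# prints false, because "fox" is missing
```

In practice, a match-any query casts a wider net, which tends to suit search-as-you-type, while a match-all query narrows the results as the user adds more terms.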

Here is the commit where I add Haystack to the project.

Adding the Search LiveView

The final step of adding Haystack to a Phoenix application is to add a live view module for rendering a search form and results, and accepting queries from the user. We’ve already created a SearchLive module in a previous step, so let’s go ahead and finish it off.

The first thing I need to do is to add the mount/3 callback to set up a couple of default assigns:

@impl true
def mount(_params, _session, socket) do
  {:ok,
    socket
    |> assign(:q, nil)
    |> assign(:articles, [])}
end

Next, I’ll add the render/1 function. This will render the form to allow the user to submit a query, as well as list the search results that were returned from the query:

@impl true
def render(assigns) do
  ~H"""
  <div class="mx-auto max-w-2xl mt-20 flex flex-col space-y-8 bg-white p-8 shadow-sm">
    <.form :let={f} for={%{}} as={:search} phx-change="search" phx-submit="search">
      <%= search_input(f, :q, value: @q, placeholder: "Search...", autofocus: true, class: "w-full border border-gray-100 text-2xl active:border-0 active:border-gray-200 focus:border-gray-200 focus:ring-0 bg-gray-50") %>
    </.form>
    <div class="flex flex-col space-y-8">
      <div :for={article <- @articles} class="flex flex-col space-y-1">
        <h2 class="text-2xl font-medium text-gray-900">
          <%= article.name %>
        </h2>
        <%= raw(article.body) %>
      </div>
    </div>
  </div>
  """
end

If you’re familiar with Phoenix and Tailwind, this should all look very familiar. Finally, when the user types into the search input box, we need to handle the "search" event. Here’s how we would do that:

@impl true
def handle_event("search", %{"search" => %{"q" => q}}, socket) do
  articles =
    q
    |> Articles.Search.search()
    |> Enum.map(& &1.ref)
    |> Articles.take()

  {:noreply, assign(socket, :articles, articles)}
end

In this handle_event/3 callback, we’re taking the q from the params and passing it to the search/1 function we defined earlier. This will return a list of search results. We use those search results to get the original articles using the take/1 function. Finally, we set the articles as an assign on the live view.
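Here is a small standalone sketch of that results-to-articles step, so you can see the data flow without running the whole application. Example.Results, the in-memory articles, and the shape of the result maps are assumptions; the only thing taken from the code above is that each search result exposes a ref.

```elixir
defmodule Example.Results do
  # Hypothetical in-memory articles; the real ones come from App.Articles.articles/0.
  @articles [
    %{id: "1", name: "First post"},
    %{id: "2", name: "Second post"}
  ]

  # Same lookup as App.Articles.take/1: one article (or nil) per id, in order.
  def take(ids) do
    Enum.map(ids, fn id -> Enum.find(@articles, &(&1.id == id)) end)
  end

  # Pull the ref out of each search result, then fetch the matching articles.
  def to_articles(results) do
    results |> Enum.map(& &1.ref) |> take()
  end
end

# Assumed result shape: only the :ref key is relied on here.
results = [%{ref: "2"}, %{ref: "1"}]

results
|> Example.Results.to_articles()
|> Enum.map(& &1.name)
|> IO.inspect()
# prints ["Second post", "First post"]
```

The articles come back in whatever order the search returned its results. Note that this style of lookup yields nil for a ref with no matching article, so if your index and your content can ever drift apart you may want to filter those out before rendering.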

Here is the commit where I finish off the SearchLive module.

Now if you spin up the application with the following command and open the application in a browser, you will be able to perform full-text searches on the content:

$ iex -S mix phx.server

Conclusion

And that’s it! You’ve successfully added full-text search to a NimblePublisher-powered Elixir Phoenix application with only a handful of lines of code.

As I mentioned earlier, your data doesn’t have to come from NimblePublisher; it could come from anywhere. The most important thing to take away from this tutorial is how incredibly easy Haystack makes adding full-text search to any Elixir Phoenix application.

If you’re looking to add full-text search to your Elixir application, Haystack might be the perfect tool for the job.

Philip Brown

@philipbrown

© Yellow Flag Ltd 2024.