Aug 10, 2016
A couple of weeks ago we looked at working with Elixir processes. Processes are the basic unit of concurrency in Elixir. They provide isolation and allow us to build fault tolerant applications.
Elixir is a functional programming language with immutable data, so there is no mutable state that you can change in place. In order to maintain state you need to pass it from one function to another.
Processes are a bit like objects in object-oriented programming languages. Processes communicate with each other via message passing, and like objects, can be thought of as a way to hold onto state.
This means that you can keep hold of state for a long time, possibly forever, by encapsulating that state in a process. And what is even more interesting, anyone who knows the pid of the process can interact with that state via message passing.
In today’s tutorial we will be looking at working with state and Elixir processes.
As I mentioned in the introduction, in functional programming languages, in order to maintain state you need to pass it from one function to another. This is exactly how you store state using processes in Elixir.
A couple of weeks ago we looked at Understanding Recursion and Tail Call Optimisation in Elixir.
When calling a function is the last thing that happens, Elixir can just jump to the function instead of adding to the stack. This makes calling a function recursively very efficient. You could do this forever without consuming any additional memory.
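To make that concrete, here is a minimal sketch of two tail-recursive functions. The TailDemo module and its function names are made up for this illustration and are not part of the module we build later. In both functions the recursive call is the very last thing that happens, so the stack should not grow however many times they recurse.

```elixir
defmodule TailDemo do
  # Hypothetical helper: counts down from n to 0.
  # The recursive call is the last expression, so it is a tail call.
  def count_down(0), do: :done
  def count_down(n) when n > 0, do: count_down(n - 1)

  # Sums the integers from 1 to n. The running total travels in the
  # accumulator, which keeps the recursive call in tail position.
  def sum_to(n, acc \\ 0)
  def sum_to(0, acc), do: acc
  def sum_to(n, acc) when n > 0, do: sum_to(n - 1, acc + n)
end

TailDemo.count_down(1_000_000)
# => :done

TailDemo.sum_to(100)
# => 5050
```

Notice that sum_to/2 carries its "state" (the running total) from one call to the next as an argument. That is the same trick we are about to use inside a process.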
So to maintain state in an Elixir process, you simply keep calling a loop function, passing in the state each time the function is called.
Inside the loop function you deal with any messages that are sent to the process. Sometimes these messages will cause the state to be updated to a new value. Once the message has been handled we can simply pass the state into the next call to the loop function.
Hopefully that makes sense. I know this is quite a bit different to how you would normally store state in a programming language like Ruby. But stick with it, I’m sure it will all come together by the end of this tutorial.
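If it helps, here is roughly the smallest version of this idea I can think of: a counter. This is only a sketch, and the Counter module, the :increment message and the {:value, caller} message are all names I have invented for the illustration.

```elixir
defmodule Counter do
  # Hypothetical example module, not part of the StatefulMap we build below.
  def start(initial \\ 0) do
    spawn(fn -> loop(initial) end)
  end

  # The state (count) lives only in the argument of this function.
  defp loop(count) do
    receive do
      :increment ->
        # "Update" the state by looping with a new value
        loop(count + 1)

      {:value, caller} ->
        # Reply to the caller, then loop with the state unchanged
        send(caller, {:count, count})
        loop(count)
    end
  end
end

pid = Counter.start()
send(pid, :increment)
send(pid, :increment)
send(pid, {:value, self()})

receive do
  {:count, count} -> IO.puts("count is #{count}")
end
# prints "count is 2"
```

The count is never assigned to or changed in place. Each message simply decides which value gets passed to the next call of loop/1.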
To illustrate how we can use processes in Elixir to store state we’re going to make a stateful map. This is basically going to be a simple map that we can add key value pairs to by sending messages to the process.
Here is the basic usage of what we’re going to build:
pid = StatefulMap.start()
# PID<0.63.0>
First we kick off the process. This creates a new pid that we must use in order to send messages to the process.
StatefulMap.put(pid, :hello, :world)
Next we can add a key value pair to the map by using the put/3 function. This requires that we pass the pid as the first argument.
StatefulMap.get(pid, :hello)
# :world
And finally we can get a value by its key by using the get/2 function. Again we need to supply the pid in order to send the message to the process.
So with that little overview out of the way, let’s take a look at creating this StatefulMap module!
The first thing we need to do is to provide a way to start a new process. If you remember back to Working with Processes in Elixir, we can do that using the spawn/1 function:
defmodule StatefulMap do
  def start do
    spawn(fn -> loop(%{}) end)
  end
end
In the code above I’ve wrapped the spawn/1 function in a start/0 function. This is the public API of this module for starting the process. As we saw earlier, you would call the start/0 function and keep a copy of the pid that is returned:
pid = StatefulMap.start()
# PID<0.63.0>
The spawn/1 function accepts a function and returns a pid. In this case I’m passing a function that will call another function called loop/1. The only argument to loop/1 is an empty map. This is our initial state.
When the process is started from the start/0 function, a new process will be spawned and it will immediately call the loop/1 function. We can now implement this function on the module:
def loop(current) do
  new =
    receive do
      message -> process(current, message)
    end

  loop(new)
end
This function receives the current state as its argument. When the process is first created this will be the empty map passed in from the start/0 function.
Inside the loop/1 function we use the receive block we saw in Working with Processes in Elixir.
When a message is received it will be handled by the process/2 function. This will return the new state, which is then passed back into the loop/1 function to start the loop again. The process will just sit in this loop waiting for messages to be received, and while it is waiting inside receive it is suspended and uses no CPU time. The Erlang Virtual Machine has been specifically created to do this kind of work, so you don’t have to worry about it like you would in other programming languages.
Next we can add the public interface functions that allow us to interact with the stateful map. In this case we need to be able to put and get items from the map.
The public interface functions are simply implemented as functions of the module. Here is the put/3 function:
def put(pid, key, value) do
  send(pid, {:put, key, value})
end
As you can see, the put/3 function is simply wrapping the send/2 function to hide the fact that we are interacting with another process.
First we require the pid so we know which process to send the message to. The second argument to send/2 is a tuple whose first element is :put. This allows us to pattern match (Understanding Pattern Matching in Elixir) on the message once it has been received so we can handle this request correctly (Multi-clause Functions with Pattern Matching and Guards in Elixir).
Next we can add the get/2 function:
def get(pid, key) do
  # self() needs its parentheses; a bare self is not accepted by current Elixir versions
  send(pid, {:get, key, self()})

  receive do
    {:response, value} -> value
  end
end
Again in this function we require the pid and the key of the value that we would like to be returned. However, whereas the put/3 function can just fire and forget, this time we need to send a message to the process and wait for the response.
Inside of the tuple we are calling the self/0 function. This will pass the pid of the current process so the process that is holding the state knows where to send the message back to.
Next we need a receive block that will accept the response. In this case we simply need to return the value.
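One thing worth knowing about a receive block like this is that it will wait forever if the reply never arrives, for example because the other process has already exited. Here is a hedged sketch of how a timeout could be added using the after clause of receive. The TimeoutSketch module name, the extra timeout argument, its 1_000 millisecond default and the {:ok, value} / {:error, :timeout} return values are all my own assumptions, not part of the module we are building.

```elixir
defmodule TimeoutSketch do
  # Hypothetical variant of get/2 that gives up after `timeout` milliseconds.
  def get(pid, key, timeout \\ 1_000) do
    send(pid, {:get, key, self()})

    receive do
      {:response, value} -> {:ok, value}
    after
      # No matching message arrived within the timeout
      timeout -> {:error, :timeout}
    end
  end
end

# A process that exits straight away will never send a reply,
# so the call below gives up after 100 milliseconds.
dead = spawn(fn -> :ok end)
TimeoutSketch.get(dead, :hello, 100)
# => {:error, :timeout}
```

For the rest of this tutorial I’m going to stick with the simpler version without a timeout.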
Now that we have the public interface functions in place we can add implementations of the process/2 function to handle the requests.
If you remember back to the loop/1 function, whenever this process receives a message, the receive block will pass the message to the process/2 function. We can now use pattern matching and multi-clause functions to deal with each message request as a separate function.
First up we can handle the :put message:
defp process(current, {:put, key, value}) do
  Map.put(current, key, value)
end
In this function we receive the current value of the state of the process and the tuple from the message. Inside the function body we can use the Map.put/3 function to add the key value pair to the map. This will return a new map that will be automatically returned from the function.
Next up we can implement the :get message handler:
defp process(current, {:get, key, caller}) do
  send(caller, {:response, Map.get(current, key)})
  current
end
In this function we are using the Map.get/2 function to get the value stored under the key in the current state map. We then send it as a tuple back to the caller pid.
Finally we need to return the current map from the function so the loop/1 function from earlier can continue to loop the state to keep it active.
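With only these two clauses, any message that doesn’t match one of them (a typo in the atom, say) will raise a FunctionClauseError inside the process, which ends the process and loses the state. One possible way around that is a catch-all clause that ignores anything it doesn’t recognise. The sketch below is an assumption on my part rather than something the module in this tutorial includes. I’ve put it in a separate, hypothetical CatchAllSketch module and made process/2 public so it can be called directly.

```elixir
defmodule CatchAllSketch do
  # Same shape as the :put handler in the tutorial
  def process(current, {:put, key, value}) do
    Map.put(current, key, value)
  end

  # Catch-all clause: any other message leaves the state untouched.
  # It must come last, because clauses are tried from top to bottom.
  def process(current, _unknown) do
    current
  end
end

state = CatchAllSketch.process(%{}, {:put, :hello, :world})
# state is %{hello: :world}

CatchAllSketch.process(state, :something_unexpected)
# => %{hello: :world}
```

Whether silently ignoring unknown messages is the right choice depends on the application. Letting the process crash is also a perfectly reasonable decision in Elixir.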
Here is the full code of the module:
defmodule StatefulMap do
  # Spawn the process with an empty map as the initial state
  def start do
    spawn(fn -> loop(%{}) end)
  end

  # Wait for a message, handle it, then loop again with the new state
  def loop(current) do
    new =
      receive do
        message -> process(current, message)
      end

    loop(new)
  end

  # Fire and forget: ask the process to store a key value pair
  def put(pid, key, value) do
    send(pid, {:put, key, value})
  end

  # Ask the process for a value and wait for the reply
  def get(pid, key) do
    send(pid, {:get, key, self()})

    receive do
      {:response, value} -> value
    end
  end

  defp process(current, {:put, key, value}) do
    Map.put(current, key, value)
  end

  defp process(current, {:get, key, caller}) do
    send(caller, {:response, Map.get(current, key)})
    current
  end
end
Fire up iex and have a play with this module for yourself. I think seeing the code work will help your understanding of what is going on and how Elixir is maintaining the state.
Storing state in Elixir processes is a bit mind bending to begin with because you need to be able to imagine how the state is stored in another process. The process that has the state is simply calling a function continuously and passing the state to itself.
This is how you keep state in a functional programming language. Inside that function we can listen for messages to interact with the state.
In today’s example we have seen how we can read the state and mutate it by passing the new state into the function. It can also be a bit confusing to get your head around which functions are called in the caller or the receiver processes.
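If you find yourself unsure which process a function is running in, one trick is to have each side report its own pid using self/0. Here is a small sketch of that idea. The WhereAmI module and its message names are hypothetical and exist only for this demonstration.

```elixir
defmodule WhereAmI do
  # Hypothetical module for illustration only.
  def start, do: spawn(fn -> loop() end)

  # A public function runs in whichever process calls it,
  # so self() here is the caller's pid.
  def ask(pid) do
    send(pid, {:who, self()})

    receive do
      {:i_am, server_pid} -> %{caller: self(), server: server_pid}
    end
  end

  # loop/0 runs inside the spawned process,
  # so self() here is the spawned process's pid.
  defp loop do
    receive do
      {:who, caller} ->
        send(caller, {:i_am, self()})
        loop()
    end
  end
end

pid = WhereAmI.start()
result = WhereAmI.ask(pid)

result.server == pid
# => true
result.caller == self()
# => true
result.caller == result.server
# => false
```

The same split applies to the module from this tutorial: start/0, put/3 and get/2 run in the calling process, while loop/1 and process/2 run in the process that holds the state.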
When I first learned about storing state in Elixir processes I created a couple of modules like the one we have looked at today to understand how it would be implemented.
I think this is often the best approach. What seems hazy in your mind suddenly becomes very clear once you are working in code.