Avoiding Atomic Angst in Elixir
You’ve started using Elixir’s Phoenix Framework, and you’re loving it. You’re learning to embrace the Domain-Driven concept of Bounded Contexts, and you’re setting yourself up for success by crafting well-defined internal APIs within your Phoenix Contexts. Then you stumble across an unexpected snag.
You need a function to gracefully accept both string and atom-based keys
But very quickly, things get awkward. Maybe there’s a function you’re relying on that
requires atom-based keys. Or maybe you find yourself stuck passing mixed string & atom-based keys
to Ecto.Changeset.cast, only to get this
error…
** (Ecto.CastError) expected params to be a map with atoms or string keys, got a map with mixed keys: %{:baz => "qux", "foo" => "bar"}
Okay, no problem. We’ll just translate the string-based keys in our map to be atoms, right…?? 🤔 Absolutely, we will. But as it turns out, there is a little bit of trickiness involved. Don’t worry, though, we’ll manage this together just fine!
First things first: String to Atom translation…
Ignoring for a moment that we’re dealing with a collection of things (a map of keys & values),
how would we translate a single string to an atom? Well, there are a few ways. Fire up iex in
your console to play along…
…the unsafe way
Perhaps most straightforward is to use
String.to_atom, like so:
iex> String.to_atom("foo")
:foo
Alternatively, you could interpolate the value into a string preceded by a colon…
iex> string_value = "foo"
iex> :"my_#{string_value}"
:my_foo
While both of these methods technically work, they’re unsafe when handling values supplied from any untrusted source—e.g. your users or a third-party system. The above methods are dangerous because atoms in the Erlang virtual machine are not garbage collected; once they’re created, they’re never destroyed!
You should never translate data from external sources into atoms using the above methods because doing so opens the door to denial-of-service attacks: A malicious user or a misbehaving system could send millions of unsupported params. Every new param processed would result in the creation of a new atom—up until the maximum number of atoms is reached. After that, the Erlang VM will crash with a “no more index entries in atom_tab” message (meaning the VM couldn’t create another atom). This is most definitely not the way.
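To see the atom table grow for yourself, here is a minimal sketch. It assumes a reasonably recent OTP release where `:erlang.system_info/1` accepts the `:atom_count` and `:atom_limit` items; the variable names are only for illustration.

```elixir
# Number of atoms currently held in the VM's atom table.
count_before = :erlang.system_info(:atom_count)

# Build a string that should not match any existing atom, then convert it.
# (Unsafe on purpose: this is exactly what we should NOT do with untrusted input.)
fresh_name = "demo_atom_#{System.unique_integer([:positive])}"
new_atom = String.to_atom(fresh_name)

count_after = :erlang.system_info(:atom_count)
atom_limit = :erlang.system_info(:atom_limit)

IO.puts("created #{inspect(new_atom)}")
IO.puts("atom count grew by #{count_after - count_before}")
IO.puts("atom limit for this VM: #{atom_limit}")
```

The count only ever goes up, which is why a stream of unique strings from an untrusted source can eventually exhaust the table.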
…the safe way
The safe way to translate strings to atoms in Elixir is to use
String.to_existing_atom.
iex> _ = :foo
iex> String.to_existing_atom("foo")
:foo
Phew—so much safer!! 😅 Note that these atoms need to already exist, and it’s generally recommended that they exist in the module where this translation is being attempted.
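As a quick illustration of the "already exist" requirement, the sketch below converts one string whose atom exists and one whose atom does not. The `unknown_name` value and the `:not_an_existing_atom` marker are made up for this example.

```elixir
# The literal below guarantees that :foo exists by the time this code runs.
_ = :foo
known = String.to_existing_atom("foo")

# A string with no matching atom raises ArgumentError instead of creating one.
unknown_name = "no_such_atom_#{System.unique_integer([:positive])}"

outcome =
  try do
    String.to_existing_atom(unknown_name)
  rescue
    ArgumentError -> :not_an_existing_atom
  end

IO.inspect(known)
IO.inspect(outcome)
```

No new atom is created in the failing case, so unknown input cannot grow the atom table.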
Second things second: Updating the Map
Here’s an example of a map we’d like to translate to fully atom-based keys…
params = %{
  "foo" => "bar",
  :baz => "qux"
}
The Tests
We want to enumerate the map’s keys and values, translating any string-based keys into atoms along the way. First, we’ll set up our tests that might look like the following (for now, we’ll just focus on the last one; feel free to implement the first two and any others you’d like).
defmodule MyApp.UtilsTest do
  use ExUnit.Case, async: true

  describe "atomize_keys" do
    test "accepts string-based keys"
    test "accepts atom-based keys"

    test "accepts mixed keys" do
      params = %{"foo" => "bar", :baz => "qux"}
      assert %{foo: "bar", baz: "qux"} = MyApp.Utils.atomize_keys(params)
    end
  end
end
The Implementation
This will get us very close…
defmodule MyApp.Utils do
  # First attempt: assumes every key is a string.
  def atomize_keys(params) do
    params
    |> Enum.map(fn {key, value} -> {String.to_existing_atom(key), value} end)
    |> Enum.into(%{})
  end
end
With this, we’re mapping over each key/value pair and safely translating string keys to atoms.
Since Enum.map returns a list of key/value tuples,
we used Enum.into to transform that list back
into a map. However, something’s not quite right, and our test fails with the following:
** (ArgumentError) errors were found at the given arguments:
* 1st argument: not a binary
If you’re new to Elixir, the “not a binary” part is probably confusing. For now, you can think of
binaries in Elixir as being strings, though that’s not exactly right. The more accurate explanation
is that all strings are binaries, but only UTF-8-encoded binaries are strings.
Essentially, this error is telling us that we didn’t pass a string to
String.to_existing_atom.
Indeed, one of the keys that we passed was already an atom. Oh, right!
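A small sketch of that distinction, using only `is_binary/1` and `String.valid?/1`; the byte sequence in `not_utf8` is an arbitrary example of invalid UTF-8.

```elixir
# A string is a binary; an atom is not.
string_is_binary = is_binary("foo")
atom_is_binary = is_binary(:foo)

# These two bytes form a binary, but not a valid UTF-8 string.
not_utf8 = <<0xFF, 0xFE>>
bytes_are_binary = is_binary(not_utf8)
bytes_are_string = String.valid?(not_utf8)

IO.inspect({string_is_binary, atom_is_binary, bytes_are_binary, bytes_are_string})
```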
For our second attempt, we’ll only translate strings, and we’ll leave atoms be:
def atomize_keys(params) do
  params
  |> Enum.map(fn
    {key, value} when is_binary(key) -> {String.to_existing_atom(key), value}
    {key, value} when is_atom(key) -> {key, value}
  end)
  |> Enum.into(%{})
end
Yay, this time our tests are passing! 🎉
(As a sidenote, did you know that anonymous functions in Elixir can
have multiple clauses, complete with pattern matching and guards? Cool, right?!
In this way, anonymous functions can act like Elixir’s
case expressions.)
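To make the comparison concrete, here is a minimal sketch of the same decision written both ways. The function names and the `:string_key` / `:atom_key` / `:unsupported_key` results are invented for this example.

```elixir
# A multi-clause anonymous function: each clause has its own pattern and guard.
classify = fn
  key when is_binary(key) -> :string_key
  key when is_atom(key) -> :atom_key
  _other -> :unsupported_key
end

# The equivalent logic expressed with case inside a single-clause function.
classify_with_case = fn key ->
  case key do
    k when is_binary(k) -> :string_key
    k when is_atom(k) -> :atom_key
    _other -> :unsupported_key
  end
end

IO.inspect(classify.("foo"))
IO.inspect(classify_with_case.(:foo))
```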
Turns out we can actually simplify this ever so slightly. There’s an arity 3
version of Enum.into that accepts a
transform function as the third argument, allowing us to iterate over the collection of params,
translate them, and collect them back into a map all in one go…
def atomize_keys(params) do
  Enum.into(params, %{}, fn
    {key, value} when is_binary(key) -> {String.to_existing_atom(key), value}
    {key, value} when is_atom(key) -> {key, value}
  end)
end
This is great! There are other ways you might see the same thing accomplished, too.
For example, Enum.reduce would do the trick,
but it’s more verbose than Enum.into and doesn’t
add anything beneficial for the moment, so we’ll skip it. Same thing with the alternate syntax
for comprehensions, so we’ll skip
them for now, too.
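For the curious, here is roughly what those two alternatives could look like. This is only a sketch for comparison: the `AtomizeSketch` module and its function names are hypothetical and not part of `MyApp.Utils`.

```elixir
defmodule AtomizeSketch do
  # Hypothetical module, used only to compare the two alternatives.

  # Enum.reduce version: build the map up one key/value pair at a time.
  def with_reduce(params) do
    Enum.reduce(params, %{}, fn {key, value}, acc ->
      Map.put(acc, to_atom_key(key), value)
    end)
  end

  # Comprehension version: `into: %{}` collects the pairs into a map.
  def with_comprehension(params) do
    for {key, value} <- params, into: %{} do
      {to_atom_key(key), value}
    end
  end

  defp to_atom_key(key) when is_binary(key), do: String.to_existing_atom(key)
  defp to_atom_key(key) when is_atom(key), do: key
end

# The :foo and :baz literals here ensure both atoms already exist.
expected = %{foo: "bar", baz: "qux"}
params = %{"foo" => "bar", :baz => "qux"}

IO.inspect(AtomizeSketch.with_reduce(params) == expected)
IO.inspect(AtomizeSketch.with_comprehension(params) == expected)
```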
Fantastic! But what about unsupported params?
Oh, good catch! If you pass params that include strings for which there is not already an existing atom,
String.to_existing_atom
will raise the following error:
** (ArgumentError) errors were found at the given arguments:
* 1st argument: not an already existing atom
There are a few ways you might handle this:
- Don’t. If the “let it crash” concept is reasonable for you here, perhaps you might leave this as is.
- Wrap each call to String.to_existing_atom with try/rescue. You could then collect the keys that fail and report the invalid keys back to the user. However, you probably don’t want this! Curious outsiders could pass all sorts of keys to your system and figure out which ones were invalid (i.e. not atoms in your code) and which ones were valid (i.e. atoms that exist in your code). We’d prefer to keep our implementation details private, thank you very much!
- Require an explicit “permitted” list of keys, and return an actionable error tuple if invalid keys are provided. Let’s pursue this option…
First, we’ll update our tests to expect an {:ok, translated_map} tuple on success or an
{:error, bad_keys} tuple on failure. We’ll also be passing a permitted list of param keys as a
new, second argument.
defmodule MyApp.UtilsTest do
  use ExUnit.Case, async: true

  describe "atomize_keys" do
    test "accepts mixed keys" do
      params = %{"foo" => "bar", :baz => "qux"}
      assert {:ok, %{foo: "bar", baz: "qux"}} = MyApp.Utils.atomize_keys(params, [:foo, :baz])
    end

    test "returns an error tuple for unsupported keys" do
      params = %{"nope" => "foo", "yep" => "bar", "no_way" => "baz"}
      assert {:error, invalid_keys} = MyApp.Utils.atomize_keys(params, [:yep])
      expected_invalid_keys = MapSet.new(["nope", "no_way"])
      actual_invalid_keys = MapSet.new(invalid_keys)
      assert MapSet.equal?(expected_invalid_keys, actual_invalid_keys)
    end
  end
end
The implementation at this point is going to look more complex. Here’s an overview in case it feels overwhelming:
- Establish the full set of permitted keys (both strings and atoms).
- Divide the provided params into two groups: valid & invalid params.
- If the list of invalid params is empty, return the success tuple. Otherwise, return the error tuple.
def atomize_keys(params, permitted_list) do
  allowed =
    permitted_list
    |> Enum.flat_map(fn
      key when is_atom(key) -> [key, Atom.to_string(key)]
      key -> raise(ArgumentError, "`permitted_list` must be atoms, received #{inspect(key)}")
    end)
    |> Enum.into(MapSet.new())

  params
  |> Enum.split_with(fn {key, _} -> MapSet.member?(allowed, key) end)
  |> case do
    {valid, []} ->
      translated_params = Enum.into(valid, %{}, fn {key, value} -> {atomize(key), value} end)
      {:ok, translated_params}

    {_valid, invalid} ->
      invalid_keys = Enum.map(invalid, &elem(&1, 0))
      {:error, invalid_keys}
  end
end

defp atomize(key) when is_atom(key), do: key
defp atomize(key) when is_binary(key), do: String.to_existing_atom(key)
This is nice! It goes the extra mile in allowing consumers of our function to provide meaningful feedback as they see fit.
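As one possible way a consumer might turn those tuples into feedback, here is a small caller-side sketch. The `ParamsFeedback` module and its message format are hypothetical; it only interprets the `{:ok, map}` and `{:error, keys}` shapes described above and does not call `MyApp.Utils` itself.

```elixir
defmodule ParamsFeedback do
  # Hypothetical helper: interprets the result tuples, nothing more.

  # Success: pass the translated params through untouched.
  def describe({:ok, params}) when is_map(params), do: {:ok, params}

  # Failure: build a readable message from the rejected keys.
  def describe({:error, invalid_keys}) when is_list(invalid_keys) do
    listed =
      invalid_keys
      |> Enum.map(&to_string/1)
      |> Enum.sort()
      |> Enum.join(", ")

    {:error, "unsupported params: " <> listed}
  end
end

IO.inspect(ParamsFeedback.describe({:ok, %{yep: "bar"}}))
IO.inspect(ParamsFeedback.describe({:error, ["nope", "no_way"]}))
```

Because the message lists only keys the caller sent, it reveals nothing about which atoms exist in the codebase.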
If you winced when you saw Atom.to_string/1,
don’t worry! It’s safe to translate atoms to strings because strings are indeed garbage collected;
it’s translating strings to atoms that you need to be cautious about.
We used a MapSet to track allowed params because, under
the hood, the membership lookup is performed on a map. For larger sets of permitted keys, it is far more
efficient to look up a key in a map than to scan an entire list for it.
In all honesty, you’re not likely to have terribly many permitted keys anyway, so depending on your scenario, the MapSet might not be strictly necessary. On the other hand, the MapSet is more semantically accurate: it captures what we’re after better than a list (a set will not contain duplicates, while a list might). It’s also the more conservative choice in that it works well across the board. Every incoming key is checked against the allowed keys, so with a list the total work grows with the number of params multiplied by the number of permitted keys, while each MapSet lookup stays fast no matter how many keys are permitted.
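A tiny sketch of both points, deduplication and membership checks, using a made-up `permitted` list:

```elixir
# A permitted list that accidentally repeats :foo.
permitted = [:foo, :baz, :foo]
allowed = MapSet.new(permitted)

# The list keeps the duplicate; the set does not.
list_length = length(permitted)
set_size = MapSet.size(allowed)

# Both answer the membership question; the list scans element by element,
# while the MapSet performs a map lookup.
in_list = :baz in permitted
in_set = MapSet.member?(allowed, :baz)
missing = MapSet.member?(allowed, :nope)

IO.inspect({list_length, set_size, in_list, in_set, missing})
```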
To Conclude
With that, we’re done. There are all sorts of variations on how we might go about this, but we’ve done good work, so we’ll stop for now. The most important lesson? Don’t translate untrusted data into atoms!