TIL: Parsing YAML with “foreign” tags in Elixir

How to read YAML with custom (Ruby-specific) tags created by Ruby on Rails.

Photo by OpenClipart-Vectors on Pixabay


Ruby on Rails (RoR) lets you store arbitrary data in a column by serializing it to YAML and storing the resulting string. If you need to access this data from Ecto/Elixir, you might end up with data types (e.g. DateTime) in the YAML string that can’t be parsed directly by Elixir.

So given an “offending” YAML document

---
last_modified: !ruby/object:DateTime 2021-08-31 09:21:04.000000000 Z

which is stored as the string

"---\nlast_modified: !ruby/object:DateTime 2021-08-31 09:21:04.000000000 Z\n"

yields a parsing error when parsed with YamlElixir/yamerl:

iex(13)> string = "---\nlast_modified: !ruby/object:DateTime 2021-08-31 09:21:04.000000000 Z\n"
"---\nlast_modified: !ruby/object:DateTime 2021-08-31 09:21:04.000000000 Z\n"


iex(14)> YamlElixir.read_from_string(string)
{:error,
 %YamlElixir.ParsingError{
   column: 16,
   line: 2,
   message: "Tag \"!ruby/object:DateTime\" unrecognized by any module",
   type: :unrecognized_node
 }}

You can either provide your own module to deal with this data type (see yamerl issue 26 for more details) or simply ignore unknown tags using the ignore_unrecognized_tags option:

iex(15)> YamlElixir.read_from_string(string, ignore_unrecognized_tags: true)
{:ok, %{"last_modified" => "2021-08-31 09:21:04.000000000 Z"}}
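Note that the value comes back as a plain string, not as a DateTime. If you need a real DateTime, a small helper along these lines might do. This is only a sketch: the module name RubyYamlTime and the function to_datetime/1 are made up for illustration and are not part of YamlElixir or yamerl. It assumes the string always looks like the one above, i.e. date, time, and a zone designator (Z or an offset such as +02:00) separated by spaces. The only real work is removing the space before the zone designator so that DateTime.from_iso8601/1 accepts the string.

```elixir
defmodule RubyYamlTime do
  # Hypothetical helper, not part of YamlElixir or yamerl.
  # Assumes input shaped like "2021-08-31 09:21:04.000000000 Z".

  # Matches the whitespace before a trailing zone designator ("Z" or "+hh:mm").
  @zone_suffix ~r/\s+(Z|[+-]\d{2}:\d{2})\z/

  def to_datetime(raw) when is_binary(raw) do
    iso =
      raw
      |> String.trim()
      # Drop the space before the zone so the string is ISO 8601 compatible.
      |> String.replace(@zone_suffix, "\\1")

    case DateTime.from_iso8601(iso) do
      # The third element is the UTC offset in seconds; the DateTime itself is in UTC.
      {:ok, datetime, _utc_offset} -> {:ok, datetime}
      {:error, reason} -> {:error, reason}
    end
  end
end

{:ok, %{"last_modified" => raw}} =
  {:ok, %{"last_modified" => "2021-08-31 09:21:04.000000000 Z"}}

IO.inspect(RubyYamlTime.to_datetime(raw))
```

The last lines stand in for the result of YamlElixir.read_from_string/2 so the snippet runs without the library. Elixir keeps at most microsecond precision, so the nanosecond digits written by Ruby are cut down to six.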
