The Ruby standard library ships with a YAML parser called Psych. Serializing and deserializing data with it, however, is not as straightforward as it is with JSON.
To safely write and read YAML files, use Psych.dump (Object#to_yaml) and Psych.safe_load (YAML.safe_load):
data = {'key' => 'value'}.to_yaml
=> "---\nkey: value\n"
YAML.safe_load(data)
=> {"key"=>"value"}
Unfortunately, there are a few pitfalls that are not obvious at first. All of them stem from the fact that Psych.dump cannot be configured to write only safe data.
Pitfall 1: Psych::DisallowedClass
Psych.safe_load only whitelists the following classes: TrueClass, FalseClass, NilClass, Numeric, String, Array and Hash. Any other class raises an exception unless you whitelist it explicitly. It is often a good idea to add Symbol, Date and Time to that list, but other classes can make sense too.
data = {foo: 'bar'}.to_yaml
::YAML.safe_load(data)
Psych::DisallowedClass: Tried to load unspecified class: Symbol
::YAML.safe_load(data, [Symbol])
=> {:foo=>"bar"}
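The same applies to Date and Time values. The following is a minimal sketch, assuming Psych 3.1.0 or newer (it uses the keyword form of safe_load described further down); the record hash and its keys are made up for illustration.

```ruby
require 'yaml'
require 'date'

# Hypothetical sample payload: Symbol keys plus a Date and a Time value.
record = {
  name: 'release',
  released_on: Date.new(2019, 11, 8),
  built_at: Time.utc(2019, 11, 8, 10, 28, 34)
}
yaml = record.to_yaml

# With the default list, loading stops at the first class that is not permitted.
rejected = nil
begin
  YAML.safe_load(yaml)
rescue Psych::DisallowedClass => e
  rejected = e.message
end

# Explicitly permitting the classes lets the data round-trip.
loaded = YAML.safe_load(yaml, permitted_classes: [Symbol, Date, Time])
```

Rescuing Psych::DisallowedClass is a cheap way to find out which classes your data actually needs before you extend the list.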
Pitfall 2: Psych::BadAlias
Psych.dump creates an anchor and an alias when the same object is referenced more than once, and Psych.safe_load rejects aliases by default. With simple values you will rarely notice this, but as soon as the same object instance (e.g. a Time) appears twice, Psych.dump writes it once and refers back to it with an alias.
time = Time.now
data = {foo: time, bar: time}.to_yaml
=> "---\n:foo: &1 2019-11-08 11:28:34.834180510 +01:00\n:bar: *1\n"
::YAML.safe_load(data, [Symbol, Time])
Psych::BadAlias: Unknown alias: 1
::YAML.safe_load(data, [Symbol, Time], [], true) # This sym
=> {:foo=>2019-11-08 11:28:34 +0100, :bar=>2019-11-08 11:28:34 +0100}
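The behaviour can be reproduced with any object that is referenced twice, not only Time. Here is a minimal sketch, assuming Psych 3.1.0 or newer for the aliases: keyword; the hash and its keys are hypothetical.

```ruby
require 'yaml'

# Hypothetical data in which two keys point at the very same Array object.
shared = %w[a b]
yaml = { 'first' => shared, 'second' => shared }.to_yaml

# Without aliases enabled, safe_load refuses the document.
alias_error = nil
begin
  YAML.safe_load(yaml)
rescue Psych::BadAlias => e
  alias_error = e
end

# With aliases enabled, both keys resolve to the same loaded object again.
loaded = YAML.safe_load(yaml, aliases: true)
same_object = loaded['first'].equal?(loaded['second'])
```

If you do not need the shared identity, dumping copies of the object (for example via dup) avoids the alias altogether.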
A note
Note that both of these restrictions exist for a reason:
Allowing symbols to be deserialized can expose an application to a DoS attack (symbols are not garbage-collected on Ruby versions before 2.2).
Parsing aliases allows "YAML bombs", which are also a form of DoS attack.
You have to decide whether this is an acceptable risk for your use case.
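One way to limit the symbol risk is to permit only a known set of symbols. This is a sketch rather than a recommendation, assuming Psych 3.1.0 or newer (keyword arguments); KNOWN_SYMBOLS and load_with_known_symbols are hypothetical names.

```ruby
require 'yaml'

# Hypothetical allowlist: only these symbols may be created while loading.
KNOWN_SYMBOLS = %i[foo bar].freeze

def load_with_known_symbols(yaml)
  YAML.safe_load(yaml, permitted_classes: [Symbol], permitted_symbols: KNOWN_SYMBOLS)
end

# A document that only uses known symbols loads normally.
accepted = load_with_known_symbols({ foo: 'bar' }.to_yaml)

# A document with any other symbol is rejected.
rejected = nil
begin
  load_with_known_symbols({ unexpected: 'value' }.to_yaml)
rescue Psych::DisallowedClass => e
  rejected = e.message
end
```

This keeps symbol keys available for the data you write yourself while refusing arbitrary symbols from untrusted input.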
New safe_load API
Starting with Psych 3.1.0, the safe_load API became more user-friendly by replacing positional arguments with keyword arguments:
if Gem::Version.new(Psych::VERSION) >= Gem::Version.new('3.1.0.pre1')
  ::YAML.safe_load(input, permitted_classes: PERMITTED_CLASSES, permitted_symbols: PERMITTED_SYMBOLS, aliases: true)
else
  ::YAML.safe_load(input, PERMITTED_CLASSES, PERMITTED_SYMBOLS, true)
end
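The version check above can be wrapped in a small helper so that callers do not need to care which API is available. This is a minimal sketch: safe_load_yaml, KEYWORD_API and the contents of both constants are assumptions chosen for illustration, not part of Psych.

```ruby
require 'yaml'
require 'date'

# Hypothetical allowlists; adjust them to the data your application writes.
PERMITTED_CLASSES = [Symbol, Date, Time].freeze
PERMITTED_SYMBOLS = %i[foo bar].freeze

# Evaluate the version check once instead of on every call.
KEYWORD_API = Gem::Version.new(Psych::VERSION) >= Gem::Version.new('3.1.0.pre1')

def safe_load_yaml(input)
  if KEYWORD_API
    YAML.safe_load(input,
                   permitted_classes: PERMITTED_CLASSES,
                   permitted_symbols: PERMITTED_SYMBOLS,
                   aliases: true)
  else
    YAML.safe_load(input, PERMITTED_CLASSES, PERMITTED_SYMBOLS, true)
  end
end

# Usage: the same Time object is referenced twice, so the dump contains an alias.
time = Time.utc(2019, 11, 8, 10, 28, 34)
restored = safe_load_yaml({ foo: time, bar: time }.to_yaml)
```

Keeping the allowlists in one place makes it easier to review what your application is willing to deserialize.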