Broadly, there are only three approaches to cache expiration. Each has overlapping use cases and varying granularity:
- Key Based - Everything that is cached is referenced by a unique name: the key. The key is composed of parts that uniquely identify a value. Often one of those parts is a timestamp or a counter, some value that will change when the data that the key represents changes. This is the finest level of cache expiration.
- Time Based — When values are cached an expiration time is set as well. When the expiration time comes around the value is dropped, and it’s time to cache a new value. Expirations can be on the order of milliseconds (system level), or years (HTTP asset level).
- Purging — This is the nuclear option. Everything is cached using the same key, forever, until all of the values are dropped. This isn’t as useless as it may sound. For example, the MemoryStore only retains values as long as the parent process is alive. Once the process stops the memory holding cached values is released. For applications with data that changes less frequently than the application restarts this is a viable option.
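As a rough illustration of the time based approach in the list above, here is a minimal sketch of a store that drops values once their expiration time passes. The `ExpiringStore` class and its injectable `clock` are hypothetical names invented for this example; they are not part of any library.

```ruby
# Hypothetical in-memory store illustrating time based expiration.
class ExpiringStore
  Entry = Struct.new(:value, :expires_at)

  # The clock is injectable so expiry can be demonstrated without sleeping.
  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  # Store a value along with the moment it stops being valid.
  def write(key, value, expires_in:)
    @entries[key] = Entry.new(value, @clock.call + expires_in)
    value
  end

  # Return the value if it is still fresh, otherwise drop it and return nil.
  def read(key)
    entry = @entries[key]
    return nil if entry.nil?

    if @clock.call >= entry.expires_at
      @entries.delete(key) # expired: time to cache a new value
      return nil
    end

    entry.value
  end
end

now = 1000.0
store = ExpiringStore.new(clock: -> { now })
store.write("Post/1", "cached body", expires_in: 60)
fresh = store.read("Post/1") # still within the 60 second window
now += 61
stale = store.read("Post/1") # nil, the entry has expired
```

Note that the key itself never changes here; only the clock decides when the value goes away.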
Regardless of how values are expired they are always referenced by a key. It all comes down to keys. The fundamental cache operation is fetch, which checks the cache store for the existence of a key and returns the value it finds. If the key isn’t found it then generates the value, writes it to the store, and then returns the new value.
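The read-or-generate behavior of fetch can be sketched in a few lines of plain Ruby. This is only an illustration built on a Hash; `TinyCache` is a hypothetical class for this example, and real stores layer serialization, expiration, and thread safety on top of the same idea.

```ruby
# Hypothetical Hash-backed cache showing the fetch operation.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the stored value for key, or run the block, store its
  # result under key, and return that result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

cache = TinyCache.new
calls = 0

first  = cache.fetch("Post/1") { calls += 1; "rendered post" }
second = cache.fetch("Post/1") { calls += 1; "rendered post" }
# The block ran only once; the second call was served from the store.
```

Checking `key?` rather than the value's truthiness means that a cached `nil` or `false` still counts as a hit in this sketch.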
Composing Cache Keys
Cache keys are built from unique segments which change when the data they reference changes. Be aware that the precise structure of the key isn’t important. The order of segments and how they are separated is inconsequential, so long as they combine into a unique value.
Simple
:class/:id -> Post/1
The most basic cache key structure possible, only a model name and the id. The parts represent the bare information needed to retrieve a cached value. This type of key can only be expired by time or a full purge, making it of limited value in production systems.
Timestamped
:class/:id-:timestamp -> Post/1-1468239686
A timestamped key appends a model’s updated_at timestamp (shown here as Unix epoch seconds). When the model is updated or touched, the key changes and the old cache entry is effectively expired. ActiveRecord generates keys of this shape for models out of the box. How records are touched, and whether they touch associated records, is critical to proper cache expiration. In the world of ActiveRecord it’s alright to touch yourself, touch your friends, and touch your friend’s friends.
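To make the effect of touching concrete, here is a small sketch that uses a plain Struct in place of a real model. The `Post` struct and its `cache_key` and `touch` methods are stand-ins written for this example and only imitate the behavior described above.

```ruby
# Stand-in for a model: just an id and an updated_at time.
Post = Struct.new(:id, :updated_at) do
  # Build a :class/:id-:timestamp key using epoch seconds.
  def cache_key
    "Post/#{id}-#{updated_at.to_i}"
  end

  # Bump updated_at, which changes the key and orphans the old entry.
  def touch(now = Time.now)
    self.updated_at = now
  end
end

post = Post.new(1, Time.at(1468239686))
before = post.cache_key # "Post/1-1468239686"

post.touch(Time.at(1468239700))
after = post.cache_key  # "Post/1-1468239700"
```

Nothing is deleted when the key changes. The old entry is simply never requested again and is left for the store to evict.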
Collection
:class_plural/:last_updated/:count -> Posts/1468239686/8
When all of the values in a collection can be cached together you have to consider more than the ids. Instead, a collection’s cache key combines the timestamp of the most recently updated model in the collection with the collection length. If any model within the collection is updated, or one is added or deleted, the cache is expired.
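A collection key along these lines might be computed as in the following sketch. The `Record` struct and the `collection_cache_key` helper are hypothetical names used only for illustration.

```ruby
# Minimal record with just the fields the key depends on.
Record = Struct.new(:id, :updated_at)

# Build a :class_plural/:last_updated/:count key for a list of records.
def collection_cache_key(name, records)
  latest = records.map { |r| r.updated_at.to_i }.max || 0 # 0 for an empty list
  "#{name}/#{latest}/#{records.size}"
end

# Eight records; the last one is the most recently updated.
records = (1..8).map { |i| Record.new(i, Time.at(1468239678 + i)) }
key = collection_cache_key("Posts", records) # "Posts/1468239686/8"

# Deleting an older record leaves the timestamp alone but changes the count.
records.shift
key_after_delete = collection_cache_key("Posts", records) # "Posts/1468239686/7"
```

Including the count matters because deleting a record that is not the most recently updated one would otherwise leave the key unchanged.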
Checksummed
:class/:id/:view/:checksum -> Post/1/SomeView/a8b56bb
This includes the name of the view that generated the value and a checksum of the view at the time the value was created. This guarantees that the cache is expired whenever fields or markup are added to the view.
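One way to derive such a checksum is to hash the view's source, as in this sketch. The `view_cache_key` helper, the use of MD5, and the seven character truncation are all assumptions made for the example; any stable digest of the template would serve the same purpose.

```ruby
require "digest/md5"

# Build a :class/:id/:view/:checksum key from the view's source text.
def view_cache_key(class_name, id, view_name, template_source)
  checksum = Digest::MD5.hexdigest(template_source)[0, 7] # short digest
  "#{class_name}/#{id}/#{view_name}/#{checksum}"
end

old_template = "<h1><%= post.title %></h1>"
new_template = "<h1><%= post.title %></h1><p><%= post.author %></p>"

old_key = view_cache_key("Post", 1, "SomeView", old_template)
new_key = view_cache_key("Post", 1, "SomeView", new_template)
# The two keys differ only in the trailing checksum segment.
```

Because the checksum comes from the template text, deploying a changed view yields new keys without anyone touching the underlying records.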
Scoped
:class/:id/:role -> Post/1/staff
The role of the current user, or “scope”, is appended. In this example, staff see different values than regular users. Appending the scope prevents sensitive data from leaking to regular users. It also guarantees that both values can be cached and served up independently.
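The scoped key can be sketched with a plain Hash standing in for the cache store. The `scoped_cache_key` helper and the role names are illustrative assumptions.

```ruby
# Build a :class/:id/:role key.
def scoped_cache_key(class_name, id, role)
  "#{class_name}/#{id}/#{role}"
end

store = {}
store[scoped_cache_key("Post", 1, "staff")] = "post with internal notes"
store[scoped_cache_key("Post", 1, "user")]  = "post without internal notes"

# Each role reads its own entry, so staff-only content is never
# served to a regular user from the cache.
for_staff = store[scoped_cache_key("Post", 1, "staff")]
for_user  = store[scoped_cache_key("Post", 1, "user")]
```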
Stapled
:class/:id/:user_id -> Post/1/123
Appending the user_id generates a new cache entry for each user. Generally this is undesirable: the cache can’t be shared at all, which defeats most of the purpose of caching. However, in a system with a limited number of users where data is extremely expensive to generate, this is a viable option.