The agonizing necessity of cached calculations

edA‑qa mort‑ora‑y - Apr 17 '17 - - Dev Community

Language support for caching sucks. Despite an ever present need to cache data and calculations, most languages require the programmer to do all the work. There's no way to track dependencies, no way to know if the cache is valid, no way to know if the cache is even useful. In this article I'll look at some of the key problems.

This article is primarily focused on caching calculations, not the caching of data. Some of the problems and solutions apply to both, but they are somewhat different in nature.

Why cache?

In Fuse we do a lot of calculations. We need to check hit test bounds, calculate curve bounds, produce local-to-world transform matrices and calculate render bounds. I'll even consider the drawing of elements to be a calculation. I'm sure I'm missing something; the point is that we have a lot to do.

On their own none of these individual calculations is a problem. En masse however, for every node in the tree, they take a lot of CPU time. Caching improves the frame rate and reduces energy usage.

We also cache inherited properties like IsVisible and IsEnabled. We don't do this for efficiency reasons, but to generate events when these values change at each node. Tree-based variables and events is another avenue to consternation all on its own.

The structure of the problem

A cached calculation looks something like this:

var cache_branch = merge( measure(leaf), measure(cone) )
Enter fullscreen mode Exit fullscreen mode

Let's assume for a moment that measure and merge are both pure functions (the result depends solely on the input arguments). This means cache_branch is valid until either leaf or cone change. But how do we know when they change?

There are two basic approaches. Either we subscribe to a leaf_changed event:

defn update_cache_branch = -> {
    cache_branch = merge( measure(leaf), measure(cone) )
}

leaf_changed.subscribe( update_cache_branch )
cone_changed.subscribe( update_cache_branch )
Enter fullscreen mode Exit fullscreen mode

Or we add a leaf_version and check it each time we calculate:

get_cache_branch = -> {
    if leaf_version != cached_leaf_version or cone_version != cone_cached_version {
        cache_branch = merge( measure(leaf), measure(cone) )
        cached_leaf_version = leaf_version
        cached_cone_version = cone_version
    }
    return cache_branch
}
Enter fullscreen mode Exit fullscreen mode

Both approaches unfortunately require that whoever changes leaf do something more: either emit the changed event or increment the version. I'll just refer to the event based approach now for simplicity.

The source of defects

That example is nearly trivial yet we can already see the coming wave of defects. Either the cache forgets to subscribe to events, or the producer fails to emit an event.

Nonetheless, to see the true scope of the problem, let's make it more complicated. We'll replace measure with calc_bounds that calculates the render bounds of a child node, where leaf and cone are such nodes (elements in a UI tree for example).

var cache_branch = merge( calc_bounds(leaf), calc_bounds(cone) )
Enter fullscreen mode Exit fullscreen mode

It would no longer make sense to subscribe to leaf_changed as we don't actually know what variables in the node are used by calc_bounds. Nor would we want to start subscribing to many different events here, it's error-prone and inefficient.

Instead we pass on the responsibility for the caching to the node itself:

var cache_branch = merge( leaf.bounds, cone.bounds )
Enter fullscreen mode Exit fullscreen mode

We can subscribe to a simple bounds_changed event on each of the nodes. It keeps this level simple, but it's pushed the burden a level down.

Each node needs to cache its own bounds and emit events when it updates. It may sound simple, but something like bounds involves a lot of variables: position, size, transforms and visibility. A change in any one of those variables must invalidate the cache somehow.

Events are expensive

Events are never free, and that's a problem. Does the cached render bounds really need to subscribe to events for all the variables of interest? Consider now that hit test bounds and layout also have cached values. Every node in a UI tree would need to subscribe to a lot of events to keep it's caches in order. The management, and dispatch, of these events, adds even more overhead.

Worse, these cached values aren't always used. It would be pointless to calculate a value, thus subscribe to events, unless something actually wanted to use the value. It's actually quite easy to build a caching system that ends up performing worse than if there we no caches at all!

In Fuse we use a third approach to alleviate this. Instead of dynamic events we hard-code some virtual functions for key caches. Things like InvalidateRenderBounds and OnRenderBoundsInvalidated. All cached values are calculated and cached lazily: after invalidation they are not calculated again until actually used.

This does wonders for efficiency, but at the expense of complexity.

Compiler help

There's no avoiding the need for these caches. It's just unfortunate there isn't much language support to help. All this work feels mundane and repetitive. Each variable we use a variable the cache ends up having the same basic structure of either subscribing to messages or checking versions. Each time we set a variable we have the same basic emitting of events or incrementing versions.

This all feels like something a compiler could be doing for us in an efficient and safe manner. I'm not clear yet on what the syntax should be. Well, that's not true, the ideal syntax would be something like:

defn get_cache_branch = -> {
    return lazy_cache_calc( merge( measure(leaf), measure(cone) ) )
}
Enter fullscreen mode Exit fullscreen mode

Where lazy_cache_calc sets up everything needed for cache: the events and subscriptions, or versions and checking, on the complete tree of dependencies. Though it seems like I might be asking for a bit much, this is essentially what reactive programming does. It's just usually reserved for higher level programming.

I'll have to investigate further.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Terabox Video Player