Wednesday 24 August 2011

Pattern matching as an optimisation method, and a means of insight

One feature of Haskell that I particularly like is its pattern-matching scheme. To route code down the correct codepath, Haskell uses pattern matching against data types and values, instead of extensive flow-control statements. This makes the code much simpler to read, as your code merely concerns itself with calling the correct functions, with the correct handler cases, and the underlying Haskell compiler deals with the flow-control minutiae.

I've increasingly found myself using a similar approach in C++ and HLSL using features such as polymorphic functions. A good example comes from optimisation: frequently, you find cases in the code where someone has written a "kitchen sink" function. This function internally handles many cases, and accordingly takes many parameters.

Frequently, many of these parameters are called with specificliteral values or constants. For example, many parameters may be zero which disable entire codepaths inside the function. For me, this is a bad code smell. But why? Many people would argue that since the code is branched away, what's the big deal? Especially if it means we can have just one definition of a function. Well, branches are very expensive on modern processors. On contemporary consoles, a floating point branch can yield a 27-cycle penalty. Not to mention all the preamble and branch set-up code.

I've therefore found myself increasingly providing multiple specialised versions of these functions, which omit some of the parameters that are permanently called with 0.0f or 1.0f and tailoring the code accordingly. Obviously this saves a lot of needless computation, branching and I$ misses.

What's interesting about this approach is that once you've performed a few of these refactorings, you start to expose patterns in the code and its structure that you previously were unaware of. You may find that your major axis of variation lies not in your chosen object hierarchy composition, but elsewhere. You may have a multitude of objects that add little value at a higher level, but now a large set of lower level functions that process the objects differently. Sometimes it makes more sense to structure and divide your code along the lines of what processing it does, not what you think the object "is", as that is the greater source of variation.

1 comment:

  1. I do most of my coding in C# these days, which while an imperative language like C/C++/Java, it does now have a lot of functional elements, including closures, lambda support and passing functions. After learning a little Haskell/Lisp/F#, I find myself drifting toward that approach as well, leaving most of my objects as simple DTOs and just sending them through different processes. It feels a lot cleaner.

    ReplyDelete