Haskell Performance Patterns

Johan Tibell

February 15, 2012

Caveats

Think about your data representation

Unpack scalar fields

Always unpack scalar fields (e.g. Int, Double):

data Vec3 = Vec3 {-# UNPACK #-} !Double
{-# UNPACK #-} !Double
{-# UNPACK #-} !Double

Use a strict spine for data structures

data Map k a = Tip
| Bin {-# UNPACK #-} !Size !k a
!(Map k a) !(Map k a)

(Note the bang on the Map k a fields.)

Specialized data types are sometimes faster

data Tree a = Leaf | Bin a !(Tree a) !(Tree a)
data IntTree = IntLeaf | IntBin {-# UNPACK #-} !Int !IntTree !IntTree

Inline recursive functions using a wrapper

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
map :: (a -> b) -> [a] -> [b]
map f = go
where
go [] = []
go (x:xs) = f x : go xs

Inline HOFs to avoid indirect calls

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs

g xs = map (+1) xs -- map is recursive => not inlined

Use strict data types in accumulators

If you're using a composite accumulator (e.g. a pair), make sure it has strict fields.

Allocates on each iteration:

mean :: [Double] -> Double
mean xs = s / n
where (s, n) = foldl' (\ (s, n) x -> (s+x, n+1)) (0, 0) xs

Doesn't allocate on each iteration:

data StrictPair a b = SP !a !b

mean2 :: [Double] -> Double
mean2 xs = s / n
where SP s n = foldl' (\ (SP s n) x -> SP (s+x) (n+1)) (SP 0 0) xs

Haskell makes it cheap to create throwaway data types like StrictPair: one line of code.

Use strict returns in monadic code

return often wraps the value in some kind of (lazy) box. This is often not what we want, especially when returning some arithmetic expression. For example, assuming we're in a state monad:

return $ x + y

creates a thunk. We most likely want:

return $! x + y

Beware of the lazy base case

Functions that would otherwise be strict might be made lazy by the "base case":

data Tree = Leaf
| Bin Key Value !Tree !Tree

insert :: Key -> Value -> Tree
insert k v Leaf = Bin k v Leaf Leaf -- lazy in @k@
insert k v (Bin k' v' l r)
| k < k' = ...
| otherwise = ...

Since GHC does good things to strict arguments, we should make the base case strict, unless the extra laziness is useful:

insert !k v Leaf = Bin k v Leaf Leaf  -- strict in @k@

In this case GHC might unbox the key, making all those comparisons cheaper.

Beware of returning expressions inside lazy data types

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just $ x / y -- creates thunk

Summary