Home >>
ICFP 2017
Sun 3 - Sat 9 September 2017 Oxford, United Kingdom

Improving STM Performance with Transactional StructsFri 29 Sep 2017

I am Michael Walker, a Ph.D student at the University of York working on concurrency testing. This post is about a talk from the Haskell Symposium this year, Improving STM Performance with Transactional Structs, by Ryan Yates and Michael L. Scott. It’s about reducing indirection in mutable data structures.

Software Transactional Memory in Haskell

Software transactional memory (STM) makes it easy to write correct concurrent programs in Haskell. It works by using the type system to isolate transactional effects from other effects, so that only actions which can be safely undone and repeated can be used in a transaction.

Transactions are atomic. When executed, they see a consistent snapshot of memory. Consider this concurrent program which doesn’t use transactions:

main = print =<< test

test = do
  r1 <- newMVar 1
  r2 <- newMVar 2
  forkIO (void (swapMVar r1 3 >> swapMVar r2 4))
  print =<< rsum r1 r2

rsum r1 r2 = do
  a <- readMVar r1
  b <- readMVar r2
  pure (a + b)

test first creates two mutable variables, then forks a thread which changes their values, and then executes the rsum action. rsum reads both variables and returns the sum. If we execute this program using dejafu, a library I wrote for testing concurrent programs, we can see all the possible results:

> resultsSet defaultWay defaultMemType test
fromList [Right 3,Right 5,Right 7]

Now let’s rewrite that to use STM, but this time make the swapping and the summing atomic:

test = do
  r1 <- atomically (newTVar 1)
  r2 <- atomically (newTVar 2)
  forkIO (void (atomically (writeTVar r1 3 >> writeTVar r2 4)))
  atomically (rsum r1 r2)

rsum r1 r2 = do
  a <- readTVar r1
  b <- readTVar r2
  pure (a + b)

As both writes are within the same atomic transaction, we no longer encounter the case where r1 is 3 and r2 is 2:

> resultsSet defaultWay defaultMemType test
fromList [Right 3,Right 7]

Atomicity makes writing correct concurrent programs and data structures much simpler.


A TVar contains a reference to another value. If you have a TVar Int, the memory situation looks like this:

TVar Int  --->  Int  --->  Int#

If you have a data structure containing TVar values, this indirection quickly blows up:

data TPair a = TPair (TVar a) (TVar a)

TPair Int  --->  TVar Int  --->  Int  --->  Int#
           --->  TVar Int  --->  Int  --->  Int#

You can reduce the indirection by wrapping everything up inside the same TVar, which helps a little:

TVar (Int, Int)  --->  (Int, Int)  --->  Int  --->  Int#
                                   --->  Int  --->  Int#

But the indirection inside the TVar is unavoidable: there are no strict unboxed TVars.

Indirection reduces memory locality, which causes cache misses and makes the performance of programs using these concurrent data structures suffer. This is a shame, because easy STM is definitely one of the killer features of Haskell.

Transactional Structs

Indirection is the problem that this paper addresses. It proposes a new transactional primitive, the TStruct, which contains a fixed number of word and pointer fields (determined at creation time); rather than just the one pointer field of a TVar. The programmer can use this flexibility to implement structures with less indirection.

For example, our mutable pair from the end of the last section could be implemented like so:

TStruct Int Int  --->  Int  --->  Int#
                 --->  Int  --->  Int#

The (Int, Int) in the middle has vanished, removing one indirection. If we want a pair of strict unboxed ints, we can do even better:

TStruct Int# Int#  --->  Int#
                   --->  Int#

Down from five indirections to two!

The paper discusses the implementation of a number of data structures using TStruct and shows promising benchmark results compared to TVar-based implementations. The authors are working on an implementation based on a recent GHC proposal by Simon Marlow for data structures with mutable fields.

This is an interesting new concurrency primitive for Haskell, and hopefully an implementation will land in GHC some day.