2014-08-31

Go Testing Patterns: Common Rendezvous Point for Concurrency

As much as I try to design interfaces that avoid exposing concurrency directly to the user, I periodically find myself testing concurrent code (usually non-exported implementation). I'd like to outline a couple of practices that have worked for me:


Common Rendezvous Point

What the common rendezvous does is ensure that all participants have reached a common place in their execution and wait there until being told to resume. Suppose we have a system under test (SUT) that involves Routine A, Routine B, and Routine C, and the SUT expects all of them to reach a common point P before continuing to the core activity that produces the side effects we want to check.



How do we best achieve this in Go? The sync package provides a great place to start, specifically the WaitGroup type, which provides a barrier similar to the Java Standard Library's CountDownLatch. Let's get started by modeling the SUT and the preparation steps for the routines!


package rendezvous

import (
        "fmt"
        "sync"
        "testing"
)

func TestSystem(t *testing.T) {
        var prep sync.WaitGroup // The preparation rendezvous point.
        prep.Add(3)             // We have three participants.

        go func() {
                fmt.Println("Routine A starting ...")
                fmt.Println("Routine A preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine A finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine A received signal to proceed.")
                // TODO: Flesh out the rest.
        }()
        go func() {
                fmt.Println("Routine B starting ...")
                fmt.Println("Routine B preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine B finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine B received signal to proceed.")
                // TODO: Flesh out the rest.
        }()
        go func() {
                fmt.Println("Routine C starting ...")
                fmt.Println("Routine C preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine C finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine C received signal to proceed.")
                // TODO: Flesh out the rest.
        }()

        select {} // Block forever. This aborts the runtime. See later steps.
}

It's time to reflect on what's going on under the hood. Let's start with some obvious questions.

Why did we use a WaitGroup as opposed to a chan? We could have created three channels shared by all routines, each closed by exactly one of them, similar to this fragment:

// Outer scope
routA := make(chan struct{})
routB := make(chan struct{})
routC := make(chan struct{})

// In Routine A
close(routA)
<-routB
<-routC

// In Routine B
close(routB)
<-routA
<-routC

// In Routine C
close(routC)
<-routA
<-routB

Sure, this approach works, but look at how needlessly verbose and fragile it is. One misimplemented handler, and boom: the whole design fails, sometimes silently!

Why do we declare the prep WaitGroup as var prep sync.WaitGroup as opposed to simply prep := sync.WaitGroup{}? We are using the zero value for the underlying struct, which is readily usable for us with this type (this topic is worthy of a blog post of its own). Using the VarDecl declaration style, we can chain subsequent VarSpec in the same definition for the same type. Suppose we add a second WaitGroup. Which of the following is cleaner and easier to read: first, second := sync.WaitGroup{}, sync.WaitGroup{} or var first, second sync.WaitGroup? We'll return to this later.
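
To make the zero-value point concrete, here is a minimal sketch. The helper name runAll and the mutex-protected counter are my own illustrative assumptions, not part of the SUT above; the only thing it is meant to show is that a WaitGroup declared with var needs no constructor or initializer before use.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll is a hypothetical helper: it starts n goroutines and returns the
// number that completed once all of them have called Done.
func runAll(n int) int {
	var wg sync.WaitGroup // Zero value; ready to use as declared.
	var mu sync.Mutex     // The zero-value Mutex is likewise ready to use.
	finished := 0

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			mu.Lock()
			finished++
			mu.Unlock()
		}()
	}

	wg.Wait() // Returns only after every goroutine has called Done.
	return finished
}

func main() {
	fmt.Println(runAll(3))
}
```

Calling Add before starting the goroutines, as done here and in the test above, avoids a race between Add and Wait.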

Let's continue.

package rendezvous

import (
        "fmt"
        "sync"
        "testing"
)

func TestSystem(t *testing.T) {
        var prep, fin sync.WaitGroup // The preparation and completion rendezvous points.
        prep.Add(3)                  // We have three participants.
        fin.Add(3)                   // We have three participants.

        go func() {
                fmt.Println("Routine A starting ...")
                fmt.Println("Routine A preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine A finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine A received signal to proceed.")
                // Perform real work here.
                fin.Done()
                fmt.Println("Routine A exited.")
        }()
        go func() {
                fmt.Println("Routine B starting ...")
                fmt.Println("Routine B preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine B finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine B received signal to proceed.")
                // Perform real work here.
                fin.Done()
                fmt.Println("Routine B exited.")
        }()
        go func() {
                fmt.Println("Routine C starting ...")
                fmt.Println("Routine C preparing ...")
                // Do expensive preparation.
                fmt.Println("Routine C finished preparation.")
                prep.Done() // Signal completion.
                prep.Wait() // Await the other participants.
                fmt.Println("Routine C received signal to proceed.")
                // Perform real work here.
                fin.Done()
                fmt.Println("Routine C exited.")
        }()

        fin.Wait()
        // Validate side effects.
        fmt.Println("All routines exited.")
}

Recall the point I made above about adding a second WaitGroup? Here it is in fin, which each goroutine uses to signal its completion of its respective work. We also replaced select {} with fin.Wait().

So what does this effectively buy us? Well, if we need to be sure for purposes of testing that all participants have reached one or more common rendezvous points before either running the SUT or validating side effects, we have a graceful protocol. This ensures both correctness and determinism!
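
The same protocol can be generalized beyond three hand-written routines. The following is only a sketch under my own assumptions: the helper name rendezvous, the atomic counters, and the "early starter" check are illustrative stand-ins for real preparation and real work, not something the SUT above defines.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// rendezvous is a hypothetical helper. It starts n goroutines that each
// prepare, meet at a common point, and then do their work. It returns how
// many goroutines saw an incomplete preparation count after the rendezvous.
func rendezvous(n int) int {
	var prep, fin sync.WaitGroup
	var prepared, early int32

	prep.Add(n)
	fin.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer fin.Done()

			atomic.AddInt32(&prepared, 1) // Stand-in for expensive preparation.
			prep.Done()                   // Announce arrival at the rendezvous point.
			prep.Wait()                   // Block until every participant has arrived.

			// Stand-in for the real work: verify nobody was left behind.
			if atomic.LoadInt32(&prepared) != int32(n) {
				atomic.AddInt32(&early, 1)
			}
		}()
	}

	fin.Wait() // All work is complete; side effects can now be validated.
	return int(atomic.LoadInt32(&early))
}

func main() {
	fmt.Println("early starters:", rendezvous(8))
}
```

Because every participant increments the counter before calling prep.Done, and prep.Wait returns only after all n calls to Done, no participant should ever observe a count below n after the rendezvous.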

Is the example missing anything? That depends on the SUT and its needs. One candidate could be cancellation on failure or timeout. Implementing cancellation correctly depends strongly on the underlying SUT, as I implied. There is no one-size-fits-all cancellation policy for every API in Go, so I leave that as an exercise to the reader. If you're interested, this article discusses how the Java API handles this problem at the thread level. Very generally, we could create a timeout policy as such:

import "time"

// rest elided

finished := make(chan struct{})

// The parameter must be send-capable (chan<-) for close to compile.
go func(sig chan<- struct{}) {
        fin.Wait()
        close(sig)
}(finished)

select {
case <-finished:
        // All routines finished in time.
case <-time.After(5 * time.Second):
        t.Fatal("SUT timed out.")
}

Do note that this does not include proper cancellation of your routines! Creating a timeout policy as such is not strictly needed, as go test offers a configurable timeout for the entire test run through its -timeout flag.

Also note that this posting does not seek to enable overly complicated API design nor Rube Goldberg Machine-style tests. Only with experience and wisdom will you be able to discern when such machinery is necessary.

In the next segment, we'll talk about concurrency level in tests.


None of the content contained herein represents the views of my employer, nor should it be construed in such a manner. All content is copyright Matt T. Proud and may not be reproduced in whole or part without expressed permission.