Structured Concurrency

Structured Concurrency

22 March 2024

… continued from the previous post.

Different Categories of Concurrency

When recalling the previous article, a task was subdivided into three subtasks, all working towards a common objective (specifically, merging contributors).

The subtasks are started in the main task and reach completion, yielding results, within the lifespan of that overarching task.

This pattern was named “Structured Concurrency” by Martin Sústrik1 and further examined in Nathaniel J. Smith’s “Notes on structured concurrency, or: Go statement considered harmful”2.

While I do not subscribe to every viewpoint expressed in these articles, I believe that this is at least a valid concurrency pattern. Also, it’s well-suited to Go’s hierarchical contexts and cancelation mechanisms.

Groundwork

Let’s start with a typical subtask

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
package task

import (
	"context"
	"fmt"
	"time"
)

func Task(ctx context.Context, name string, processingTime time.Duration, result error) error {
	ready := time.NewTimer(processingTime)

	select {
	case <-ctx.Done():
		ready.Stop()
		fmt.Println(name, ctx.Err())

		return fmt.Errorf("%s canceled: %w", name, ctx.Err())

	case <-ready.C:
		fmt.Println(name, result)
	}

	return result
}

We define a task.Task as a dummy workload, having a name as an identity and a processingTime after which it finishes.

The task has two properties that are important:

  • It has a synchronous API and returns an error
  • It takes a context.Context parameter and exits early when the context is canceled.

Having a context is especially important so that we don’t perform a lot of work which is irrelevant in case the overarching task already failed.

Consider this scenario: You make a query to a service that fails to respond. Eventually, a higher-level context times out, and it is important for your function to terminate to prevent any potential resource leakage.

When a request is canceled or times out, all the goroutines working on that request should exit quickly so the system can reclaim any resources they are using.3

Context

We define an doWork function which takes a higher-level context and distributes the work over three subtasks.

First, we create a sub-context of the passed context which will be canceled when we leave the scope. We pass this context to all created goroutines, ensuring we don’t leak resources.

Additionally, when we can’t complete our task (because a subtask failed) we cancel the remaining work so we can return early.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
package main

import (
	"context"

	"fillmore-labs.com/blog/structured/pkg/task"
)

func doWork(ctx context.Context) error {
	ctx, cancel := context.WithCancelCause(ctx)
	defer cancel(nil)

	var g int
	errc := make(chan error)

	g++
	go func() {
		errc <- task.Task(ctx, "task1", processingTime/3, nil)
	}()

	g++
	go func() {
		errc <- task.Task(ctx, "task2", processingTime/2, errFail)
	}()

	g++
	go func() {
		errc <- task.Task(ctx, "task3", processingTime, nil)
	}()

	var err error
	for range g {
		if e := <-errc; e != nil && err == nil {
			err = e
			cancel(err)
		}
	}

	return err
}

Since we fail task 2 we expect the following call sequence:

sequenceDiagram
    participant Main
    create participant Work
    Main->>Work: go Work(main ctx)
    Note right of Work: Create work ctx
    create participant Task1
    Work->>Task1: go Task1(work ctx)
    create participant Task2
    Work->>Task2: go Task2(work ctx)
    create participant Task3
    Work->>Task3: go Task3(work ctx)
    Note over Task1: Task1 completes
    destroy Task1
    Task1->>Work: no error (“nil”)
    Note over Task2: Task2 completes
    destroy Task2
    Task2->>Work: failed
    Note over Task3: Task3 processing
    Note right of Work: First error cancels work ctx
    Work--)Task3: cancel work ctx
    Note over Task3: Task3 interrupted
    destroy Task3
    Task3->>Work: canceled
    Note right of Work: All subtaks complete
    destroy Work
    Work->>Main: failed

Running Multiple Subtasks Concurrently

So, we run a number of subtasks in goroutines, simply counting them with g++ - which is easier to track than having a magic 3 at the top - and collect all results in the end. This is pretty simple code, and when we run it we get the expected result:

> go run fillmore-labs.com/blog/structured/cmd/structured1
task1 <nil>
task2 failed
task3 context canceled
Got "failed" error in 501ms

Try it on the Go Playground.

Also, we see that the task returns nearly immediately after the first failure (500 milliseconds) and doesn’t let the third subtask unnecessarily consume resources.

Considering the possibility of “optimization”, where we could return the error immediately without awaiting the completion of canceled subtasks: Since we expect canceled subtasks to return quickly (they can just abort), we might save very little time in error scenarios, without any gain in normal processing. Compared to the risk of a resource leak this appears to be a bad tradeoff.

Summary

Structured Concurrency is a pattern that is useful in writing correct Go programs. A context parameter und creating sub-contexts is helpful for avoiding resource leaks.

… continued in the next post.


  1. Martin Sústrik. 2016. Structured Concurrency. In 250bpm Blog — February 2016 — <250bpm.com/blog:71/↩︎

  2. Nathaniel J. Smith. 2018. Notes on structured concurrency, or: Go statement considered harmful. In njs blog — April 2018 — <vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/↩︎

  3. Sameer Ajmani. 2014. Go Concurrency Patterns: Context. In The Go Blog — July 2014 — <go.dev/blog/context↩︎