
Java Virtual Threads


After reading my articles about Go concurrency, a friend asked me whether one could do something similar in Java.

Project Loom #

Since the release of JDK 21, Java has virtual threads[1]:

    Thread.startVirtualThread(() -> {
      System.out.println("Hello, world");
    });

As an equivalent to Go’s goroutines:

	go func() {
		fmt.Println("Hello, world")
	}()
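
One practical difference is that `Thread.startVirtualThread` hands back a `Thread` object, so the caller can wait for it with `join()`. Virtual threads are daemon threads, so a program that does not wait may exit before the greeting is printed. A minimal sketch, assuming JDK 21 or later; the class name `HelloVirtual` and the method `runOnVirtualThread` are made up for illustration:

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class HelloVirtual {
  // Starts a virtual thread, waits for it to finish, and reports
  // whether the task really ran on a virtual thread.
  static boolean runOnVirtualThread() {
    var ranOnVirtual = new AtomicBoolean(false);
    Thread thread = Thread.startVirtualThread(() -> {
      System.out.println("Hello, world");
      ranOnVirtual.set(Thread.currentThread().isVirtual());
    });
    try {
      thread.join(); // without this, main() could return first
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return ranOnVirtual.get();
  }

  public static void main(String[] args) {
    System.out.println("virtual: " + runOnVirtualThread());
  }
}
```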

A Simple Example #

As in our Go experiments, we implement[2] a simple recursive calculation of the Fibonacci sequence:

package com.fillmore_labs.blog.jvt;

public final class Slow {
  public static int fibonacci(int n) {
    if (n < 2) {
      return n;
    }

    var fn1 = fibonacci(n - 1);
    var fn2 = fibonacci(n - 2);

    return fn1 + fn2;
  }
}

Then call it 1,000 times:

import com.fillmore_labs.blog.jvt.Slow;

void main() {
  for (int i = 0; i < 1_000; i++) {
    // var queryStart = Instant.now();
    Slow.fibonacci(27);
    // var duration = Duration.between(queryStart, Instant.now());
  }
}

Running this on our good old N5105 CPU gives us:

> bazel run //:try1
INFO: Running command line: bazel-bin/try1
*** Finished 1000 runs in 1.219s - avg 1.214ms, stddev 48.555µs

This is even a little faster[3] than our Go version. Nice.
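
The average and standard deviation in the output could be derived from the two timing lines that are commented out in the loop. The sketch below is one way to do it, not necessarily what the benchmark harness does: the class `RunStats`, the record `Summary` and the use of the population standard deviation are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class RunStats {
  record Summary(Duration average, Duration stddev) {}

  // Mean and population standard deviation of the measured durations.
  static Summary summarize(List<Duration> durations) {
    double mean = durations.stream().mapToLong(Duration::toNanos).average().orElse(0);
    double variance = durations.stream()
        .mapToDouble(d -> Math.pow(d.toNanos() - mean, 2))
        .average()
        .orElse(0);
    return new Summary(
        Duration.ofNanos(Math.round(mean)),
        Duration.ofNanos(Math.round(Math.sqrt(variance))));
  }

  // Same recursion as Slow.fibonacci, repeated to keep the example self-contained.
  static int fibonacci(int n) {
    return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
  }

  public static void main(String[] args) {
    List<Duration> durations = new ArrayList<>();
    for (int i = 0; i < 1_000; i++) {
      var queryStart = Instant.now();
      fibonacci(27);
      durations.add(Duration.between(queryStart, Instant.now()));
    }
    var summary = summarize(durations);
    System.out.println("avg " + summary.average() + ", stddev " + summary.stddev());
  }
}
```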

So, let’s try a naïve approach to parallelize things:

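package com.fillmore_labs.blog.jvt;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public final class Parallel1 {
  // FutureTask.get() throws checked exceptions, so they are declared here.
  public static int fibonacci(int n) throws InterruptedException, ExecutionException {
    if (n < 2) {
      return n;
    }

    // Start one virtual thread per recursive call.
    var ff1 = new FutureTask<>(() -> fibonacci(n - 1));
    Thread.startVirtualThread(ff1);
    var ff2 = new FutureTask<>(() -> fibonacci(n - 2));
    Thread.startVirtualThread(ff2);

    // get() blocks until the corresponding thread has produced its result.
    return ff1.get() + ff2.get();
  }
}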

Resulting in:

> bazel run //:try2
INFO: Running command line: bazel-bin/try2
*** Finished 1000 runs in 279.364s - avg 279.346ms, stddev 54.647ms

At 4 minutes and 39 seconds, this is a little better than what Go did, but still much slower than our single-threaded solution.
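
To get a feeling for why, it helps to count how many virtual threads a single `fibonacci(27)` starts in the naïve version. Every invocation except the outermost one runs on its own thread, so the number of threads is the number of recursive invocations minus one. The helper names `calls` and `threadsStarted` below are made up for this sketch:

```java
final class ThreadCount {
  // Number of fibonacci() invocations the naive recursion makes for input n:
  // calls(n) = 1 + calls(n - 1) + calls(n - 2), with calls(0) = calls(1) = 1.
  static long calls(int n) {
    long prev = 1; // calls(0)
    long curr = 1; // calls(1)
    for (int i = 2; i <= n; i++) {
      long next = 1 + curr + prev;
      prev = curr;
      curr = next;
    }
    return curr;
  }

  // In the naive parallel version, every invocation except the root
  // is executed by a freshly started virtual thread.
  static long threadsStarted(int n) {
    return calls(n) - 1;
  }

  public static void main(String[] args) {
    System.out.println(threadsStarted(27)); // prints 635620
  }
}
```

That is more than 600,000 threads per run, most of which do nothing but one comparison or one addition, so thread creation and scheduling plausibly dominate the run time.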

Analyzing Flame Graphs #

If we look at the flame graph of the single-threaded run:

flame graph of run 1

> bazel run //:bench1 -- -prof "async:output=flamegraph;direction=forward"
Iteration   1: 1220.789 ms/op
Benchmark             Mode  Cnt     Score   Error  Units
Bench1.measure          ss       1220.789          ms/op

We see a little time spent interpreting/compiling the program and mostly working on our Fibonacci implementation. Our naïve implementation looks like this:

flame graph of run 2

We spend a lot of time blocked on a Mutex in the JVM Tool Interface, maybe the global JvmtiThreadState_lock?

Other Approaches #

Anyway, we are not here to debug the JVM, so let’s try some other approaches.

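package com.fillmore_labs.blog.jvt;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;

public record Parallel3(ExecutorService e) {
  // Future.get() throws checked exceptions, so they are declared here.
  public int fibonacci(int n) throws InterruptedException, ExecutionException {
    if (n < 2) {
      return n;
    }

    // Hand one branch to the executor and compute the other on the current thread.
    var ff1 = e.submit(() -> fibonacci(n - 1));
    var fn2 = fibonacci(n - 2);

    return ff1.get() + fn2;
  }
}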

Sharing an ExecutorService and using the ‘original’ thread to do some work improves things:

> bazel run //:try3
INFO: Running command line: bazel-bin/try3
*** Finished 1000 runs in 179.452s - avg 179.426ms, stddev 41.363ms

flame graph of run 3

3 minutes is faster (interestingly enough, we lose to Go here), but still slower than the single-threaded version.
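
The listing does not show how the record is driven. One possibility, assuming the shared executor is a virtual-thread-per-task executor, is sketched here; the wrapper class `Parallel3Demo` and its `run` method are hypothetical, and the record is repeated so the example compiles on its own:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class Parallel3Demo {
  // Copy of the record from above, nested to keep the sketch self-contained.
  record Parallel3(ExecutorService e) {
    int fibonacci(int n) throws InterruptedException, ExecutionException {
      if (n < 2) {
        return n;
      }
      var ff1 = e.submit(() -> fibonacci(n - 1));
      var fn2 = fibonacci(n - 2);
      return ff1.get() + fn2;
    }
  }

  // Creates one executor, shares it across the whole recursion, and closes it afterwards.
  static int run(int n) {
    try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
      return new Parallel3(executor).fibonacci(n);
    } catch (InterruptedException | ExecutionException ex) {
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    System.out.println(run(15)); // prints 610
  }
}
```

Blocking in `get()` is cheap here because every task gets its own virtual thread. With a small fixed-size pool of platform threads, the same recursion could block all workers while they wait for subtasks that never get a thread.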

So, let’s move parallelization to the calling function:

import com.fillmore_labs.blog.jvt.Slow;
import java.util.concurrent.Executors;

void main() {
  try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 1_000; i++) {
      // var queryStart = Instant.now();
      executor.execute(
          () -> {
            Slow.fibonacci(27);
            // var duration = Duration.between(queryStart, Instant.now());
          });
    }
  }
}

> bazel run //:try4
INFO: Running command line: bazel-bin/try4
*** Finished 1000 runs in 349.151ms - avg 164.952ms, stddev 88.675ms

flame graph of run 4

This has a flame graph similar to that of the single-threaded version and is approximately 3.5 times faster.
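
The version above throws the results away. If the caller needs them, `submit` can be used instead of `execute`, and the futures read after the executor has been closed. This is a sketch under that assumption; the class `CollectResults` and the method `sumOfRuns` are illustrative names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CollectResults {
  // Same recursion as Slow.fibonacci, repeated to keep the example self-contained.
  static int fibonacci(int n) {
    return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
  }

  // Runs fibonacci(n) `runs` times, one virtual thread per run, and adds up the results.
  static long sumOfRuns(int runs, int n) {
    List<Future<Integer>> futures = new ArrayList<>();
    try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
      for (int i = 0; i < runs; i++) {
        futures.add(executor.submit(() -> fibonacci(n)));
      }
    } // close() waits until all submitted tasks have finished

    long sum = 0;
    for (var future : futures) {
      try {
        sum += future.get(); // already completed, returns immediately
      } catch (InterruptedException | ExecutionException ex) {
        throw new IllegalStateException(ex);
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(sumOfRuns(1_000, 27));
  }
}
```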

Improve Latency #

Now let us limit the number of queued calls:

import com.fillmore_labs.blog.jvt.Slow;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

void main() throws InterruptedException {
  try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    var numCPU = Runtime.getRuntime().availableProcessors();
    var pool = new Semaphore(numCPU);
    for (int i = 0; i < 1_000; i++) {
      // var queryStart = Instant.now();
      pool.acquire();
      executor.execute(
          () -> {
            Slow.fibonacci(27);
            // var duration = Duration.between(queryStart, Instant.now());
            pool.release();
          });
    }
  }
}

> bazel run //:try5
INFO: Running command line: bazel-bin/try5
*** Finished 1000 runs in 359.420ms - avg 1.697ms, stddev 665.871µs

flame graph of run 5

This improves our average latency from 165ms to 1.7ms, while the total run time stays about the same (359ms vs. 349ms).
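
The effect of the semaphore can be checked directly: a task is only handed to the executor once a permit is free, so the number of tasks in flight never exceeds the number of permits. The sketch below records the highest concurrency it observes. The class `BoundedRuns`, the method `maxInFlight` and the `try`/`finally` around the release are additions for illustration, not part of the original program:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedRuns {
  // Same recursion as Slow.fibonacci, repeated to keep the example self-contained.
  static int fibonacci(int n) {
    return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
  }

  // Runs `runs` tasks with at most `permits` in flight and returns
  // the highest number of tasks observed running at the same time.
  static int maxInFlight(int runs, int permits) {
    var pool = new Semaphore(permits);
    var running = new AtomicInteger();
    var max = new AtomicInteger();
    try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
      for (int i = 0; i < runs; i++) {
        pool.acquireUninterruptibly(); // blocks the submitting loop when all permits are taken
        executor.execute(() -> {
          try {
            max.accumulateAndGet(running.incrementAndGet(), Math::max);
            fibonacci(20);
          } finally {
            running.decrementAndGet();
            pool.release(); // released even if the task fails
          }
        });
      }
    }
    return max.get();
  }

  public static void main(String[] args) {
    int numCPU = Runtime.getRuntime().availableProcessors();
    System.out.println("max in flight: " + maxInFlight(1_000, numCPU));
  }
}
```

Since the start timestamp is taken before `acquire`, the remaining latency presumably consists of the wait for a permit plus the calculation itself, instead of the time spent competing with hundreds of other runnable threads.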

Summary #

Exercises on how many threads can be started on a certain machine are mostly boring - this metric primarily showcases the small initial stack size of virtual threads.

Seeing Java adopt virtual threads is exciting. However, it’s unlikely that Java code will resemble Go or Erlang soon. Developing correct, efficient concurrent code is much more than just replacing one threading model with another[4], and there are fundamental differences in the existing (standard) libraries.

… continued in part two.


  1. Ron Pressler, Alan Bateman. 2023. Virtual Threads. JDK Enhancement Proposals, JEP 444, March 2023. openjdk.org/jeps/444

  2. The code is available on GitHub at github.com/fillmore-labs/blog-javavirtualthreads

  3. This isn’t a comparison of Go and Java, at least not in terms of performance. Java excels in benchmarks and repetitive tasks.

  4. Alan Bateman. 2023. The Challenges of Introducing Virtual Threads to the Java Platform - Project Loom. JVM Language Summit 2023, August 2023. youtu.be/WsCJYQDPrrE?t=667