B

Multithreaded Node.js Applications

Is it worth it?

T+B

Who are We?

Thomas Hunter II
@tlhunter
Bryan English
@bengl
T

Multithreaded Javascript

(2021)

B

We work for Datadog. And we're hiring!

B

Concurrency vs Parallelism

  • Concurrency: Tasks are run in overlapping time.
  • Parallelism: Tasks run at exactly the same time.
[Diagram: Concurrency vs. Parallelism]
Quick show of hands...
Which model is the JS Event Loop?
B

Five lies that programmers believe about JavaScript

(Yes, we've heard these or variations on them fairly recently.)

B

Lie #1

“JS is single-threaded.”

B

Lie #2

“I need to scale to N processes if I have N cores.”

B

Lie #3

“I only need to give it 1 core or less.”

B

Lie #4

“You can't do any multithreaded programming in JS.”

[sad duck noises]

B

Lie #5

“Processes are the only way to get true parallelism.”

T

Some Basics

T

Fun Fact, your Hello World is multithreaded

$ ps -M -p 40046
  PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
40046 s005    0.0 S    31T   0:00.01   0:00.03 node hello.js
40046         0.0 S    31T   0:00.00   0:00.00
40046         0.0 S    31T   0:00.00   0:00.00
40046         0.0 S    31T   0:00.00   0:00.00
40046         0.0 S    31T   0:00.00   0:00.00
40046         0.0 S    31T   0:00.00   0:00.00
40046         0.0 S    31T   0:00.00   0:00.00
          
  • Around seven threads by default
  • Includes things like V8, GC, libuv thread pools
T

Worker Threads vs Child Process

  • Threads are lighter weight (~6MB vs ~32MB)
  • Can't share obj references with either approach
  • Can't really share memory between processes
    • At least with JavaScript alone
  • OS schedules work in either case
T

Why use threads?

  1. Your application's performance would benefit from parallelism
  2. Your system has additional cores available


If these conditions haven't been met then you shouldn't use threads.
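The second condition can be roughly checked at startup. This is a minimal sketch, not a rule from the talk: the names `logicalCores` and `spareCores` are hypothetical, and `os.cpus()` reports the host's logical cores, so it will not reflect a container CPU quota.

```javascript
const os = require('node:os');

// Logical cores visible to the OS. This ignores cgroup / container CPU
// limits, so treat it as an upper bound rather than a guarantee.
const logicalCores = os.cpus().length;

// Assumption for illustration: leave one core for the main thread.
const spareCores = Math.max(0, logicalCores - 1);

console.log(`logical cores: ${logicalCores}, candidates for workers: ${spareCores}`);
```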
T

Quick API Overview

  • on('message') / postMessage()
    • Bidirectional communication, queued in event loop
  • new SharedArrayBuffer()
    • Basically an ArrayBuffer, but...
    • It can be referenced from multiple threads
  • Atomics.*
    • Global namespace object, like Math (not a constructor)
    • Coordination and data manipulation
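The `SharedArrayBuffer` and `Atomics` bullets can be tried on a single thread before any worker is involved. A minimal sketch, with hypothetical variable names:

```javascript
// A SharedArrayBuffer is accessed through typed-array views, like an ArrayBuffer.
const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT * 2);
const view = new Int32Array(shared);

Atomics.store(view, 0, 40);                 // atomic write to slot 0
const before = Atomics.add(view, 0, 2);     // adds 2, returns the previous value
const after = Atomics.load(view, 0);        // atomic read of slot 0

// compareExchange writes the new value only if the slot holds the expected one.
// Slot 1 starts at 0, so this stores 7 and returns the old value.
const swapped = Atomics.compareExchange(view, 1, 0, 7);

console.log({ before, after, swapped, slot1: view[1] });
```

The same `shared` buffer could be handed to a worker, where a second `Int32Array` view would read and write the same memory.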
B

Multithreaded Patterns

B

Patterns Overview

Message Passing: Can be done entirely using the .on('message') and .postMessage() methods.

Shared Memory: Still requires message passing but then uses SharedArrayBuffer and Atomics.

Hybrid: Uses message passing for synchronization while sharing data using SharedArrayBuffer.

B

Pattern: Message Passing

  • Simplest pattern, familiar event pattern
  • postMessage() and on('message')
  • Comes with serialization overhead
  • Events queued up in the event loop
  • Comparable to handling a libuv callback or a DOM click event
  • Least performant way to use multiple threads
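A minimal sketch of the message passing pattern. The worker source is kept inline with `eval: true` only so the example fits in one file; `sumInWorker` and the message shape are hypothetical.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code as a string (CommonJS), run via { eval: true }.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (msg) => {
    // msg is a structured clone of what the main thread sent, not a shared reference.
    const sum = msg.numbers.reduce((a, b) => a + b, 0);
    parentPort.postMessage({ sum });
  });
`;

function sumInWorker(numbers) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('error', reject);
    worker.once('message', (reply) => {
      resolve(reply.sum);
      worker.terminate(); // one-shot worker for the sake of the sketch
    });
    worker.postMessage({ numbers }); // cloned on the way over
  });
}

const messagePassingResult = sumInWorker([1, 2, 3, 4]);
messagePassingResult.then((sum) => console.log('sum from worker:', sum));
```

Both directions pay the cloning cost, which is the serialization overhead the slide refers to.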
B

Pattern: Shared Memory

  • Most difficult pattern, unfamiliar for JS devs
  • SharedArrayBuffer and Atomics.*
  • Need to use low level locks to coordinate
  • Comes with risk of data tearing / interleaving
  • Maximum performance and pitfalls
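A minimal sketch of the simplest shared memory case: several workers incrementing one shared counter. The worker count, iteration count, and all names are assumptions for illustration. The buffer is handed over once through `workerData`; after that no messages are exchanged.

```javascript
const { Worker } = require('node:worker_threads');

const WORKERS = 4;        // hypothetical values for the sketch
const INCREMENTS = 10000;

const counterBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const counter = new Int32Array(counterBuffer);

const incrementSource = `
  const { workerData } = require('node:worker_threads');
  const counter = new Int32Array(workerData.buffer);
  for (let i = 0; i < workerData.increments; i++) {
    // A plain counter[0]++ is a separate read and write, so concurrent
    // workers could overwrite each other's updates. Atomics.add cannot.
    Atomics.add(counter, 0, 1);
  }
`;

function runIncrementer() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(incrementSource, {
      eval: true,
      workerData: { buffer: counterBuffer, increments: INCREMENTS },
    });
    worker.once('error', reject);
    worker.once('exit', resolve);
  });
}

// Resolves with the final count once every worker has exited.
const sharedCounterTotal = Promise.all(
  Array.from({ length: WORKERS }, () => runIncrementer())
).then(() => Atomics.load(counter, 0));

sharedCounterTotal.then((total) => console.log('total:', total));
```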
B

Pattern: Hybrid

  • Read / write data to SharedArrayBuffer
  • Use postMessage() for synchronization
  • Middle ground for performance and complexity
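A minimal sketch of the hybrid pattern: the worker writes bulk data into a `SharedArrayBuffer` and sends one small message as the synchronization point. Buffer size and names are hypothetical.

```javascript
const { Worker } = require('node:worker_threads');

const hybridBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT * 8);
const hybridData = new Int32Array(hybridBuffer);

const fillSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const data = new Int32Array(workerData);
  for (let i = 0; i < data.length; i++) {
    data[i] = i * i;               // bulk data goes through shared memory
  }
  parentPort.postMessage('done');  // tiny message signals completion
`;

const hybridResult = new Promise((resolve, reject) => {
  const worker = new Worker(fillSource, { eval: true, workerData: hybridBuffer });
  worker.once('error', reject);
  // Read the shared data only after the worker says it has finished writing.
  worker.once('message', () => resolve(Array.from(hybridData)));
});

hybridResult.then((values) => console.log(values));
```

Only the string `'done'` is cloned; the payload itself is never serialized.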
T

Real-world Use-Cases

T

Use-Case: Horizontal Scaling

  • Imagine a reimplementation of cluster
  • Main thread passes requests to workers
  • Each thread handles entire request
  • Main "router" thread simply round-robins
  • Entirely message passing based
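A minimal sketch of the round-robin router idea, with the HTTP layer left out. `RoundRobinPool`, the message shape, and the pool size are hypothetical; a real reimplementation of `cluster` would also need error handling and worker restarts.

```javascript
const { Worker } = require('node:worker_threads');

const handlerSource = `
  const { parentPort, threadId } = require('node:worker_threads');
  parentPort.on('message', ({ id, path }) => {
    // Each worker handles the whole "request" and replies with the result.
    parentPort.postMessage({ id, body: 'handled ' + path, threadId });
  });
`;

class RoundRobinPool {
  constructor(size) {
    this.workers = [];
    this.pending = new Map(); // request id -> resolve callback
    this.next = 0;
    this.nextId = 0;
    for (let i = 0; i < size; i++) {
      const worker = new Worker(handlerSource, { eval: true });
      worker.on('message', (reply) => {
        this.pending.get(reply.id)(reply);
        this.pending.delete(reply.id);
      });
      this.workers.push(worker);
    }
  }

  dispatch(path) {
    const id = this.nextId++;
    const worker = this.workers[this.next];
    this.next = (this.next + 1) % this.workers.length; // simple round-robin
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      worker.postMessage({ id, path });
    });
  }

  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}

const pool = new RoundRobinPool(2);
const poolReplies = Promise.all(['/a', '/b', '/c', '/d'].map((p) => pool.dispatch(p)))
  .then(async (replies) => {
    await pool.close();
    return replies;
  });

poolReplies.then((replies) => {
  console.log(replies.map((r) => `${r.body} on thread ${r.threadId}`));
});
```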
T

Use-Case: Off-Thread Rendering

  • Main thread has an HTTP framework e.g. Fastify
  • Render templates on worker threads e.g. React
T

Perf: Off-Thread Rendering

  • Simple web app that does Mustache rendering
  • Rendering happens in main or worker thread
  • Autocannon suggests ~25.5k r/s vs ~40.5k r/s
  • Event Loop delay (setImmediate hrtime latency):
B

Use-case: Divide and Conquer

  • Also called map/reduce
  • Game of Life: Break map into smaller chunks
  • Each thread handles a subset of map calculations
  • SharedArrayBuffer(s) + Atomics
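A minimal sketch of the chunking idea, under stated assumptions: cells are stored as 0 or 1 in `Uint8Array` views over two `SharedArrayBuffer`s, cells outside the grid count as dead, and `stepRows` is a hypothetical helper. To stay short, the two chunks run sequentially on one thread; in a threaded version each worker would receive both buffers plus its own row range, and the per-generation coordination (for example with `Atomics.wait` and `Atomics.notify`) is omitted here.

```javascript
// Compute the next generation for rows [startRow, endRow) only.
// Reads from `current`, writes to `next`, so chunks never write the same cell.
function stepRows(current, next, width, height, startRow, endRow) {
  for (let y = startRow; y < endRow; y++) {
    for (let x = 0; x < width; x++) {
      let neighbors = 0;
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          if (dx === 0 && dy === 0) continue;
          const ny = y + dy;
          const nx = x + dx;
          // Assumption: no wrap-around, off-grid cells are dead.
          if (ny < 0 || ny >= height || nx < 0 || nx >= width) continue;
          neighbors += current[ny * width + nx];
        }
      }
      const alive = current[y * width + x] === 1;
      next[y * width + x] = neighbors === 3 || (alive && neighbors === 2) ? 1 : 0;
    }
  }
}

const width = 5;
const height = 5;
const current = new Uint8Array(new SharedArrayBuffer(width * height));
const next = new Uint8Array(new SharedArrayBuffer(width * height));

// Seed a horizontal "blinker" in the middle row.
current[2 * width + 1] = 1;
current[2 * width + 2] = 1;
current[2 * width + 3] = 1;

// Two chunks, as two workers would split the map.
stepRows(current, next, width, height, 0, 3);
stepRows(current, next, width, height, 3, height);

// Indexes of live cells after one generation.
const liveCells = Array.from(next).flatMap((cell, i) => (cell === 1 ? [i] : []));
console.log(liveCells);
```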
B

Game of Life:
One Thread

B

Game of Life:
Six Threads

T

Further Consideration

T

Performance

  • Every application and environment is unique
  • Cannot test on MacBook Pro and confidently ship
  • You must test your own app in real prod env
We found a case where an app ran fine in prod but locked up in staging: staging allowed only 100 millicores per app, while prod had multiple cores.
B

About Serialization

  • Serialization / deserialization overhead
    • e.g. JSON-serialize on the main thread, then msgpack-encode later
    • structuredClone, JSON.stringify, all slow
  • dd-trace can't simply send all data off-thread
    • All our state is on the main thread
  • Off-thread state storage
    • Keep model in second thread
    • Statsd aggregation worker, send ints to crunch
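A minimal sketch of the off-thread state idea: the aggregation state lives only in the worker, and the main thread sends small numeric messages. The message types (`count`, `flush`) and metric names are hypothetical, not dd-trace's actual protocol.

```javascript
const { Worker } = require('node:worker_threads');

const aggregatorSource = `
  const { parentPort } = require('node:worker_threads');
  const totals = new Map(); // the model is kept on this thread only
  parentPort.on('message', (msg) => {
    if (msg.type === 'count') {
      totals.set(msg.name, (totals.get(msg.name) || 0) + msg.value);
    } else if (msg.type === 'flush') {
      parentPort.postMessage(Object.fromEntries(totals));
      totals.clear();
    }
  });
`;

const aggregator = new Worker(aggregatorSource, { eval: true });

// Small payloads: a short name and an integer, cheap to clone.
aggregator.postMessage({ type: 'count', name: 'requests', value: 1 });
aggregator.postMessage({ type: 'count', name: 'requests', value: 2 });
aggregator.postMessage({ type: 'count', name: 'errors', value: 1 });
aggregator.postMessage({ type: 'flush' });

const flushedTotals = new Promise((resolve, reject) => {
  aggregator.once('error', reject);
  aggregator.once('message', (totals) => {
    aggregator.terminate();
    resolve(totals);
  });
});

flushedTotals.then((totals) => console.log(totals));
```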
T

Code Complexity Concerns

  • Heterogeneous threads are hard to reason about
  • Having many locks is also tricky
  • No built-in Mutex (yet), have to roll your own
  • No shared objects means lots of "writing C in JS"
  • It may be worth it for performance
  • Should you really be doing it in JS at that point?
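A minimal sketch of what "roll your own" can look like, built on `Atomics.compareExchange`, `Atomics.wait`, and `Atomics.notify`. The `Mutex` class and its method names are hypothetical, and this is not a production-grade lock. Note that `Atomics.wait` blocks the calling thread, so calling `lock()` on the main thread would stall the event loop while it waits.

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

class Mutex {
  // `shared` must be an Int32Array over a SharedArrayBuffer;
  // `index` is the slot used as the lock word.
  constructor(shared, index = 0) {
    this.shared = shared;
    this.index = index;
  }

  // Attempt to take the lock without blocking.
  tryLock() {
    return Atomics.compareExchange(this.shared, this.index, UNLOCKED, LOCKED) === UNLOCKED;
  }

  // Block the calling thread until the lock is acquired.
  lock() {
    while (!this.tryLock()) {
      // Sleeps only while the slot still holds LOCKED.
      Atomics.wait(this.shared, this.index, LOCKED);
    }
  }

  unlock() {
    if (Atomics.compareExchange(this.shared, this.index, LOCKED, UNLOCKED) !== LOCKED) {
      throw new Error('unlock() called on a mutex that was not locked');
    }
    Atomics.notify(this.shared, this.index, 1); // wake one waiter, if any
  }

  // Run fn inside the critical section and always release the lock.
  withLock(fn) {
    this.lock();
    try {
      return fn();
    } finally {
      this.unlock();
    }
  }
}

// Single-threaded walk-through of the lock states.
const lockWords = new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
const mutex = new Mutex(lockWords);
const firstAttempt = mutex.tryLock();   // lock was free
const secondAttempt = mutex.tryLock();  // already held
mutex.unlock();
const sectionResult = mutex.withLock(() => 'critical section ran');
console.log({ firstAttempt, secondAttempt, sectionResult });
```

Each thread would construct its own `Mutex` over the same shared `Int32Array` slot.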
B

The Future

B

The Future

TC39 Structs Proposal!

  • Structs, Shared Structs, Mutex, SharedArray
  • Can't sneak in objects, null prototype
  • Access props in an unsafe{} block, linting
  • Multithreaded JavaScript v2 ;)
B
T

That's it

Presentation
Code Samples

Keep an eye out for Multithreaded JavaScript 2nd ed!