Benchmarks

This page is the high-level benchmark summary across Node, Deno, and Bun, with a Tokio close-up for the small-payload fast path. For runtime-specific detail, see the dedicated pages for Node, Deno, Bun, and Tokio.

These benchmarks quantify communication overhead and scaling behavior in specific benchmark shapes. They are not a direct prediction of end-to-end application speedup.

  • IPC overhead between main and worker threads
  • End-to-end message latency as batch size increases
  • Payload sensitivity (primitive, structured, and binary types)
  • Throughput at growing payload sizes (up to 1 MiB)
  • Heavy-task scaling and parallel efficiency under CPU-intensive workloads

Interpretation rule: “x faster” here means faster for this benchmark setup (payloads, batching, CPU profile), not universally x faster for any app.



Tokio close-up

Primitive handoffs, copy-heavy bytes, and a separate Arc reference

This close-up mixes two kinds of communication cost: primitive handoff slices and one copy-heavy Uint8Array slice from the fairer default tables. The Arc&lt;Vec&lt;u8&gt;&gt; card is separate and should be read as a Tokio shared-ownership reference, not as the default apples-to-apples byte comparison. Markers to the left are faster; markers to the right are slower.

number f64 · batch 1

Primitive handoff only: Node and Bun are faster, Tokio stays ahead of Deno, and all four remain in the 6-22 µs range.

  • Tokio 13.01 µs baseline
  • Bun + Knitting 7.35 µs 44% faster
  • Node + Knitting 6.63 µs 49% faster
  • Deno + Knitting 21.54 µs 66% slower

number f64 · batch 10

Still a handoff-cost slice: Bun, Node, and Deno stay below 20 µs, while Tokio averages 27.50 µs.

  • Tokio 27.50 µs baseline
  • Bun + Knitting 13.41 µs 51% faster
  • Node + Knitting 17.28 µs 37% faster
  • Deno + Knitting 11.97 µs 56% faster

Uint8Array 512 KiB · batch 100

This is the copy/clone-heavy byte path: Bun and Node edge Tokio on average, while Deno is a bit slower and all four land in the same 22-25 ms band.

  • Tokio 23.06 ms baseline
  • Bun + Knitting 22.33 ms 3% faster
  • Node + Knitting 22.66 ms 2% faster
  • Deno + Knitting 25.26 ms 10% slower

Arc<Vec<u8>> ref · 512 B · batch 100

Separate Tokio shared-ownership reference, not the default byte benchmark: Bun comes in about 6% faster than Tokio, while Node and Deno are slower.

  • Tokio 79.51 µs baseline
  • Bun + Knitting 74.78 µs 6% faster
  • Node + Knitting 97.23 µs 22% slower
  • Deno + Knitting 123.11 µs 55% slower

Primitive and copied-byte rows come from the fairer default comparison. The Arc row is included as a separate small-payload shared-ownership reference.

For primitive calls and small binary payloads, Knitting stays in the same microsecond tier as Tokio in these benchmark shapes. Read that as handoff cost: wakeup/signaling overhead plus copying or cloning very small payloads, not a claim that every payload shape matches Tokio. That matters because it shows the shared-memory fast path is not just good for JavaScript; it can stay competitive with a Rust baseline when coordination dominates.

The IPC combined chart compares Knitting against worker postMessage, websocket, and HTTP across runtimes.

  • At one message, Knitting is typically about 3.5x-6x faster than worker postMessage.
  • Against websocket, Knitting is usually around 3.5x-15x faster.
  • Against HTTP, Knitting is usually around 10x-57x faster for the same benchmark shape.
Combined IPC benchmark chart across runtimes

This chart shows how latency changes as the number of messages per iteration increases.

  • Across runtimes and batch sizes, Knitting is generally around 3x-45x faster than worker baselines.
  • The advantage is strongest in small-to-medium batch ranges where message overhead dominates.
  • At very large batches, absolute latency rises for all approaches, but Knitting still keeps lower overhead.
Latency line chart across runtimes
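The shape of those latency curves follows a simple amortization model: each iteration pays a fixed per-batch overhead plus a per-message cost, so the per-message cost falls toward a floor as the batch grows. A sketch with made-up constants (not measured Knitting numbers):

```javascript
// Toy latency model with illustrative constants, not measured Knitting numbers:
// per-iteration latency = fixed per-batch overhead + n * per-message cost.
function batchLatencyUs(fixedUs, perMsgUs, n) {
  return fixedUs + perMsgUs * n;
}

// Amortized cost per message shrinks toward the per-message floor.
const perMessageUs = (n) => batchLatencyUs(10, 0.5, n) / n;

console.log(perMessageUs(1));   // 10.5 µs/message
console.log(perMessageUs(100)); // 0.6 µs/message
```

This is why the advantage is strongest in small-to-medium batches: there, the fixed overhead term still dominates the total.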

These two charts compare type-dependent overhead at count 1 and count 100.

  • Count 1: Knitting remains low-latency even for structured and binary payloads.
  • Count 100: batching increases throughput while preserving strong relative performance.
  • Heavier payload classes (e.g. large objects/arrays, errors) cost more in every runtime, but Knitting keeps the best profile overall.
Combined types benchmark chart for count 1
Combined types benchmark chart for count 100
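One reason payload classes diverge: cloning a structured object means walking every node of the graph, while a typed-array clone is a flat buffer copy. A small illustration using the standard structuredClone global (a shape check only, not a timing claim):

```javascript
// structuredClone must traverse object graphs node by node,
// while a Uint8Array clone is a single flat buffer copy.
const objectPayload = { items: Array.from({ length: 1000 }, (_, i) => ({ i })) };
const binaryPayload = new Uint8Array(512 * 1024);

const objectCopy = structuredClone(objectPayload);
const binaryCopy = structuredClone(binaryPayload);

console.log(objectCopy.items.length); // 1000 nested objects, each deep-copied
console.log(binaryCopy.byteLength);   // 524288 bytes, copied into a new buffer
```

The same asymmetry shows up in every runtime's types chart: object-heavy payloads cost more per element than an equivalent number of raw bytes.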

From the 1 MiB (1,048,576 B) row (avg) in each runtime’s *_call-growth-batch result (batch = 64), one-way transfer throughput is:

Runtime   String (GB/s)   Uint8Array (GB/s)
Node      1.44            7.50
Deno      3.37            5.78
Bun       11.86           16.21
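These figures follow the usual throughput derivation: payload bytes times batch count, divided by the average per-iteration time. A sketch with an illustrative timing (the actual avg values behind the table live in the per-runtime results and are not reproduced here):

```javascript
// One-way throughput in GB/s from a growth-batch row:
// payloadBytes * batch bytes move per iteration; avgMs is the measured mean.
function gbPerSec(payloadBytes, batch, avgMs) {
  return (payloadBytes * batch) / (avgMs / 1000) / 1e9;
}

// Illustrative only: a 1 MiB payload at batch 64 averaging 10 ms per iteration.
console.log(gbPerSec(1024 * 1024, 64, 10)); // ≈ 6.71 GB/s
```

Note the GB here is decimal (1e9 bytes); using GiB instead would shift every figure down by about 7%.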

Quick takeaways:

  • Bun is clearly fastest at 1 MiB in this batched benchmark shape, for both text and binary payloads.
  • Node’s Uint8Array path improves sharply under batching and overtakes Deno on binary throughput, while Deno keeps the stronger string path.
  • This complements the IPC/latency charts by showing behavior when payload size, not just call count, grows.
  • At this size, ceilings are strongly influenced by memory bandwidth and runtime internals, not application logic.

Heavy-load benchmarks run a CPU-intensive prime-number workload and distribute work across extra threads.

  • Speedup grows steadily as threads are added, reaching roughly 3.5x-3.8x at +4 extra threads in these runs.
  • Efficiency remains strong under contention, staying around ~70-77% at higher thread counts.
  • Bun and Deno show slightly stronger scaling than Node in this specific heavy-load scenario.
  • This does not imply every app gets 3.5x+; I/O-heavy services often see smaller changes.
Heavy-load speedup chart across runtimes
Heavy-load efficiency chart across runtimes
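Assuming efficiency here is the standard definition, speedup divided by total thread count (the main thread plus the extras), the quoted bands check out:

```javascript
// Parallel efficiency: speedup divided by total threads (main + extras).
// Definition assumed from the quoted bands, not taken from the suite's source.
function efficiency(speedup, extraThreads) {
  return speedup / (1 + extraThreads);
}

// 3.5x-3.8x speedup at +4 extra threads lands inside the ~70-77% band.
console.log(efficiency(3.8, 4)); // 0.76
console.log(efficiency(3.5, 4)); // 0.7
```

Efficiency below 100% is expected: coordination, memory contention, and uneven work division all eat into ideal linear scaling.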
  • Use this page for cross-runtime trends.
  • Use benchmarks/node, benchmarks/deno, benchmarks/bun, and benchmarks/tokio for raw tables and per-runtime interpretation.
Run the benchmark suite:

./run.sh

Results are written into results/.

To emit JSON instead:

./run.sh --json

JSON output is useful for plotting scripts under graphs/.