Benchmarks

This page is the high-level benchmark summary across Node, Deno, and Bun, with a Tokio close-up for the small-payload fast path. For runtime-specific detail, see the dedicated pages for Node, Deno, Bun, and Tokio.

These benchmarks quantify communication overhead and scaling behavior in specific benchmark shapes. They are not a direct prediction of end-to-end application speedup.

  • IPC overhead between main and worker threads
  • End-to-end message latency as batch size increases
  • Payload sensitivity (primitive, structured, and binary types)
  • Throughput at growing payload sizes (up to 1 MiB)
  • Heavy-task scaling and parallel efficiency under CPU-intensive workloads

Interpretation rule: “x faster” here means faster for this benchmark setup (payloads, batching, CPU profile), not universally x faster for any app.



Tokio close-up

Primitive handoffs, copy-heavy bytes, and a separate Arc reference

This close-up mixes two kinds of communication cost: primitive handoff slices and one copy-heavy Uint8Array slice from the fairer default tables. The Arc&lt;Vec&lt;u8&gt;&gt; card is separate and should be read as a Tokio shared-ownership reference, not as the default apples-to-apples byte comparison. Markers to the left are faster; markers to the right are slower.

number f64 · batch 1

Primitive handoff only: Node and Bun are faster, Tokio stays ahead of Deno, and all four remain in the 6-22 µs range.

  • Tokio 13.01 µs baseline
  • Bun + Knitting 7.35 µs 44% faster
  • Node + Knitting 6.63 µs 49% faster
  • Deno + Knitting 21.54 µs 66% slower

number f64 · batch 10

Still a handoff-cost slice: Bun, Node, and Deno stay below 20 µs, while Tokio averages 27.50 µs.

  • Tokio 27.50 µs baseline
  • Bun + Knitting 13.41 µs 51% faster
  • Node + Knitting 17.28 µs 37% faster
  • Deno + Knitting 11.97 µs 56% faster

Uint8Array 512 KiB · batch 100

This is the copy/clone-heavy byte path: Bun and Node edge Tokio on average, while Deno is a bit slower and all four land in the same 22-25 ms band.

  • Tokio 23.06 ms baseline
  • Bun + Knitting 22.33 ms 3% faster
  • Node + Knitting 22.66 ms 2% faster
  • Deno + Knitting 25.26 ms 10% slower

Arc<Vec<u8>> ref · 512 B · batch 100

Separate Tokio shared-ownership reference, not the default byte benchmark: Bun comes in about 6% faster than Tokio, while Node and Deno are slower.

  • Tokio 79.51 µs baseline
  • Bun + Knitting 74.78 µs 6% faster
  • Node + Knitting 97.23 µs 22% slower
  • Deno + Knitting 123.11 µs 55% slower

Primitive and copied-byte rows come from the fairer default comparison. The Arc row is included as a separate small-payload shared-ownership reference.

For primitive calls and small binary payloads, Knitting stays in the same microsecond tier as Tokio in these benchmark shapes. Read that as handoff cost: wakeup/signaling overhead plus copying or cloning very small payloads, not a claim that every payload shape matches Tokio. That matters because it shows the shared-memory fast path is not just good for JavaScript; it can stay competitive with a Rust baseline when coordination dominates.

The IPC combined chart compares Knitting against worker postMessage, websocket, and HTTP across runtimes.

  • At one message, Knitting is typically about 3.5x-6x faster than worker postMessage.
  • Against websocket, Knitting is usually around 3.5x-15x faster.
  • Against HTTP, Knitting is usually around 10x-57x faster for the same benchmark shape.
Combined IPC benchmark chart across runtimes

This chart shows how latency changes as the number of messages per iteration increases.

  • Across runtimes and batch sizes, Knitting is generally around 3x-45x faster than worker baselines.
  • The advantage is strongest in small-to-medium batch ranges where message overhead dominates.
  • At very large batches, absolute latency rises for all approaches, but Knitting still keeps lower overhead.
Latency line chart across runtimes
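The shape of those latency curves follows a simple amortization model: each iteration pays a fixed per-batch overhead plus a per-message cost, so the per-message cost falls toward a floor as the batch grows. A sketch with made-up constants (not measured Knitting numbers):

```javascript
// Toy latency model with illustrative constants, not measured Knitting numbers:
// per-iteration latency = fixed per-batch overhead + n * per-message cost.
function batchLatencyUs(fixedUs, perMsgUs, n) {
  return fixedUs + perMsgUs * n;
}

// Amortized cost per message shrinks toward the per-message floor.
const perMessageUs = (n) => batchLatencyUs(10, 0.5, n) / n;

console.log(perMessageUs(1));   // 10.5 µs/message
console.log(perMessageUs(100)); // 0.6 µs/message
```

This is why the advantage is strongest in small-to-medium batches: there, the fixed overhead term still dominates the total.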

These two charts compare type-dependent overhead at count 1 and count 100.

  • Count 1: Knitting remains low-latency even for structured and binary payloads.
  • Count 100: batching increases throughput while preserving strong relative performance.
  • Heavier payload classes (e.g. large objects/arrays, errors) cost more in every runtime, but Knitting keeps the best profile overall.
Combined types benchmark chart for count 1
Combined types benchmark chart for count 100
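One reason payload classes diverge: cloning a structured object means walking every node of the graph, while a typed-array clone is a flat buffer copy. A small illustration using the standard structuredClone global (a shape check only, not a timing claim):

```javascript
// structuredClone must traverse object graphs node by node,
// while a Uint8Array clone is a single flat buffer copy.
const objectPayload = { items: Array.from({ length: 1000 }, (_, i) => ({ i })) };
const binaryPayload = new Uint8Array(512 * 1024);

const objectCopy = structuredClone(objectPayload);
const binaryCopy = structuredClone(binaryPayload);

console.log(objectCopy.items.length); // 1000 nested objects, each deep-copied
console.log(binaryCopy.byteLength);   // 524288 bytes, copied into a new buffer
```

The same asymmetry shows up in every runtime's types chart: object-heavy payloads cost more per element than an equivalent number of raw bytes.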

From the 1 MiB (1,048,576 B) row (avg) in each runtime’s *_call-growth-batch result (batch = 64), one-way transfer throughput is:

Runtime   String (GB/s)   Uint8Array (GB/s)
Node      1.44            7.50
Deno      3.37            5.78
Bun       11.86           16.21
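These figures follow the usual throughput derivation: payload bytes times batch count, divided by the average per-iteration time. A sketch with an illustrative timing (the actual avg values behind the table live in the per-runtime results and are not reproduced here):

```javascript
// One-way throughput in GB/s from a growth-batch row:
// payloadBytes * batch bytes move per iteration; avgMs is the measured mean.
function gbPerSec(payloadBytes, batch, avgMs) {
  return (payloadBytes * batch) / (avgMs / 1000) / 1e9;
}

// Illustrative only: a 1 MiB payload at batch 64 averaging 10 ms per iteration.
console.log(gbPerSec(1024 * 1024, 64, 10)); // ≈ 6.71 GB/s
```

Note the GB here is decimal (1e9 bytes); using GiB instead would shift every figure down by about 7%.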

Quick takeaways:

  • Bun is clearly fastest at 1 MiB in this batched benchmark shape, for both text and binary payloads.
  • Node’s Uint8Array path improves sharply under batching and overtakes Deno on binary throughput, while Deno keeps the stronger string path.
  • This complements the IPC/latency charts by showing behavior when payload size, not just call count, grows.
  • At this size, ceilings are strongly influenced by memory bandwidth and runtime internals, not application logic.

Heavy-load benchmarks run a CPU-intensive prime-number workload and distribute work across extra threads.

  • Speedup grows steadily as threads are added, reaching roughly 3.5x-3.8x at +4 extra threads in these runs.
  • Efficiency remains strong under contention, staying around ~70-77% at higher thread counts.
  • Bun and Deno show slightly stronger scaling than Node in this specific heavy-load scenario.
  • This does not imply every app gets 3.5x+; I/O-heavy services often see smaller changes.
Heavy-load speedup chart across runtimes
Heavy-load efficiency chart across runtimes
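Assuming efficiency here is the standard definition, speedup divided by total thread count (the main thread plus the extras), the quoted bands check out:

```javascript
// Parallel efficiency: speedup divided by total threads (main + extras).
// Definition assumed from the quoted bands, not taken from the suite's source.
function efficiency(speedup, extraThreads) {
  return speedup / (1 + extraThreads);
}

// 3.5x-3.8x speedup at +4 extra threads lands inside the ~70-77% band.
console.log(efficiency(3.8, 4)); // 0.76
console.log(efficiency(3.5, 4)); // 0.7
```

Efficiency below 100% is expected: coordination, memory contention, and uneven work division all eat into ideal linear scaling.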
  • Use this page for cross-runtime trends.
  • Use benchmarks/node, benchmarks/deno, benchmarks/bun, and benchmarks/tokio for raw tables and per-runtime interpretation.
Run the benchmark suite:

./run.sh

Results are written into results/.

To emit JSON instead:

./run.sh --json

JSON output is useful for plotting scripts under graphs/.