# Knitting: full documentation

> Knitting is a zero-dependency worker pool over a shared-memory IPC runtime for Node.js, Deno, and Bun. Export a function, create a pool, and call it like a normal async function, on real threads or isolated processes.

## Essentials (read this first)

- Install: `npm install knitting` (the npm package is `knitting`; it is also on JSR as `@vixeny/knitting`). Requires Node 22+, Deno 2+, or Bun 1+.
- A task is an exported function at module scope. Wrap it with `task({ f })` only when you want options like a timeout or an abort signal.
- Tasks take ONE argument. Use a tuple or object for multiple values: `([a, b]) => a + b`.
- Guard host-only code with `isMain`: workers re-import the module.
- Module loading: each worker re-imports the module that DEFINES your tasks, and its top-level `import`s run in every worker (they are hoisted, so `isMain` does NOT gate them). Keep tasks in a lean module separate from your server/framework code. Tasks must be `export`ed or the loader can't find them and the call silently hangs. `importTask` targets must be plain functions, not `task()` wrappers.
- Create a pool with `createPool(options)({ taskA, taskB })`, then call `await pool.call.taskA(args)`.
- Cleanup: `using pool = createPool(...)` disposes the pool at scope exit. `await pool.shutdown()` still exists to close it earlier or to await teardown.
- Isolation: `importTask({ href, name })` keeps a task's code off the host (only the worker imports it). Set `worker.runtime: "process"` to run each worker as a separate process, including inside a bwrap sandbox or a container.
- Security: for untrusted or security-sensitive code, define the task with `importTask`. The host holds only a typed wrapper and never imports or evaluates the module, so that code runs only in the worker (under its permissions), never at host scope.
- Zero-copy IN: `ProcessSharedBuffer` (`knitting/shared-memory`) shares bytes across processes; `SharedArrayBuffer` and `BufferReference` (`knitting/unsafe`) move bytes to thread workers without copying. Pick by boundary: process vs thread.
- Zero-copy OUT (good practice): for large binary results from a thread worker, RETURN a `BufferReference` so bytes move back instead of being copied through the transport. `knitting/utils` converts string/JSON/number ↔ `SharedArrayBuffer`.
- Optimized for HTTP: `call.*()` accepts `Promise` inputs, so forward `request.arrayBuffer()` (e.g. Hono `c.req.arrayBuffer()`) straight into a task without awaiting it on the request thread; UTF-8 decode / JSON parse then happens in the worker. Ideal for SSR, JWT, and upload routes.
- Workers are quiet and contained by default: in strict mode (the default) worker `console.*` does NOT reach the host; set `permission: { console: true }` to surface it. Task code cannot take the host down: `process.exit`, `process.kill`, `process.abort`, and `Deno.exit` are blocked.
- Debugging goes to STDERR: pass `debug: true` to `createPool` (or set the `KNITTING_DEBUG=*` env var) to stream diagnostics, each line tagged with the worker (`host`, `w0`, `w1`, …), the runtime, and a per-worker ms timer. To select namespaces instead of all, use `debug: { host: true, imports: true }` or `KNITTING_DEBUG=host,imports`. The namespaces are `host` (pool/task setup), `imports` (which modules each worker loaded), `lifecycle` (worker ready / process events), `signals` (per-dispatch traffic, very chatty), and `globals` (`globalThis` pollution per load phase). The option and the env var merge; either can enable a namespace. Zero-cost when off: the logger module isn't even imported.
- Payload size: dynamic payloads are hard-capped at ~8 MiB by default (over-cap calls reject with `KNT_ERROR_3`).
  Raise it with `payload: { maxPayloadBytes, payloadMaxByteLength }`; `maxPayloadBytes` must be `<= payloadMaxByteLength >> 3`, and the buffer growth cap defaults to 64 MiB.
- Cancellation & timeouts: `task({ f, timeout: { time: 100 } })` bounds a call, `task({ f, abortSignal: true })` injects an `AbortSignal`, and `worker.hardTimeoutMs` is a hard wall-clock kill for runaway CPU.
- Errors are real: thrown errors and rejected promises return to the host as `Error` objects with `name`, `message`, `stack`, and the full `cause` chain.

```ts
import { createPool, isMain } from "knitting";

export const square = (n: number) => n * n;
export const greet = (name: string) => `hello ${name}`;

if (isMain) {
  // `using` shuts the pool down when this block ends.
  using pool = createPool({ threads: 2 })({ square, greet });

  const [n, msg] = await Promise.all([
    pool.call.square(8),
    pool.call.greet("knitting"),
  ]);
  console.log({ n, msg }); // { n: 64, msg: "hello knitting" }
}
```

---

# Installation

URL: https://knittingdocs.netlify.app/start/installation/

Install Knitting from npm for Node.js, Deno, and Bun, and verify runtime requirements for shared-memory worker IPC.

Use this page to check runtime requirements and install Knitting correctly for your JavaScript runtime.

## Requirements

- Node.js 22+
- Deno 2+
- Bun 1.0+

## Install

Knitting is published on npm as `knitting`. Latest release: `0.1.55`.

> Note: Knitting is also published on JSR as `@vixeny/knitting`, the same code under a
> scoped name. The npm package is the simplest path on every runtime, so prefer it
> unless you specifically need the JSR entry.

---

# Quick Start

URL: https://knittingdocs.netlify.app/start/quick-start/

Build your first Knitting worker pool, define tasks, call them like async functions, run them in parallel, and shut down cleanly with `using`.

This quick start walks the whole loop: define tasks, create a pool, call tasks like async functions, and let the pool shut down cleanly with `using`.
## Introduction

In a few minutes you’ll have your first tasks running in parallel. If you already know what workers are, think of Knitting as: *“workers, but with a function-call API and much lower overhead.”*

### Quick definitions

A few words we’ll use a lot:

- **Task**: a function the workers can run. The simplest task is just an exported function; wrap it with `task({ f })` when you want options like timeouts or aborts.
- **`call.*()`**: runs a task on the pool and returns a `Promise` with the result.
- **`isMain`**: a boolean that’s `true` only on the host. Workers re-import your module, so guard host-only code, like creating the pool, with it.
- **`createPool()`**: starts the workers and returns a typed `call` object plus `shutdown`. The pool is also disposable, so `using` can close it for you.
- **Host ↔ Worker**: the host is the process that creates the pool; workers are the threads (or processes) that run the tasks.

### Examples

Now that we share the vocabulary, the snippets should feel familiar.

```ts
import { createPool, isMain } from "knitting";

// A task is just an exported function the workers can run.
export const greet = (name: string) => `hello ${name}`;

if (isMain) {
  // `using` shuts the pool down automatically when this block ends.
  using pool = createPool({ threads: 1 })({ greet });

  console.log(await pool.call.greet("knitting")); // hello knitting
}
```

```ts
import { createPool, isMain } from "knitting";

export const hello = () => "hello ";
export const world = (prefix: string) => `${prefix}world!`;

if (isMain) {
  using pool = createPool({ threads: 2 })({ hello, world });

  // call.hello() returns a promise; Knitting resolves it before world runs.
  const lines = await Promise.all(
    Array.from({ length: 3 }, () => pool.call.world(pool.call.hello())),
  );
  console.log(lines.join(" ")); // hello world! hello world! hello world!
}
```

```ts
import { createPool, isMain } from "knitting";

// Several tasks share one pool.
// Calls are promises, so you can chain them.
export const double = (n: number) => n * 2;
export const square = (n: number) => n * n;

if (isMain) {
  using pool = createPool({ threads: 2 })({ double, square });

  const results = await Promise.all(
    [1, 2, 3, 4, 5].map(async (n) => pool.call.square(await pool.call.double(n))),
  );
  console.log(results); // [4, 16, 36, 64, 100]
}
```

```ts
import { createPool, isMain, task } from "knitting";

// Wrap a function with task() when you want options like a timeout.
// This call is too slow, so it falls back to the default instead of hanging.
export const slow = task({
  timeout: { time: 100, default: "timed out" },
  f: async (name: string) => {
    await new Promise((resolve) => setTimeout(resolve, 1_000));
    return `hello ${name}`;
  },
});

if (isMain) {
  using pool = createPool({ threads: 1 })({ slow });

  console.log(await pool.call.slow("knitting")); // timed out
}
```

## Build it step by step

1. Import what you need:

   ```ts
   import { createPool, isMain } from "knitting";
   ```

2. Export your tasks. The simplest task is just an exported function, kept at module scope so workers can load it by name:

   ```ts
   export const square = (n: number) => n * n;
   export const greet = (name: string) => `hello ${name}`;
   ```

3. Create the pool inside an `isMain` guard. Workers re-import this module, and the guard keeps them from re-running your host code:

   ```ts
   if (isMain) {
     using pool = createPool({ threads: 2 })({ square, greet });
   }
   ```

   `using` makes the pool close itself when the block ends, with no manual cleanup.

4. Call tasks like async functions. They’re just promises, so batch them with `Promise.all`:

   ```ts
   if (isMain) {
     using pool = createPool({ threads: 2 })({ square, greet });

     const [n, message] = await Promise.all([
       pool.call.square(8),
       pool.call.greet("knitting"),
     ]);
     console.log({ n, message }); // { n: 64, message: "hello knitting" }
   }
   ```

5. Shut down when you’re done.
   With `using`, that already happens at the end of the block: workers stop and the process can exit. When you need to control the timing (or you’re on a runtime without `using`), call `shutdown()` yourself:

   ```ts
   const pool = createPool({ threads: 2 })({ square, greet });
   try {
     console.log(await pool.call.square(8));
   } finally {
     await pool.shutdown();
   }
   ```

### A task can make its own pool

For a quick script with a single task, skip the separate `createPool` call:

```ts
import { isMain, task } from "knitting";

export const double = task({
  f: (n: number) => n * 2,
}).createPool({ threads: 2 });

if (isMain) {
  try {
    console.log(await double.call(21)); // 42
  } finally {
    await double.shutdown();
  }
}
```

## Good habits

A few things that keep Knitting code clean and fast.

### Keep tasks in their own module(s)

It keeps your app code tidy and lets workers load only what they need.

* package.json
* deno.json
* src
  * knitting
    * database.ts
    * img_parsing.ts
    * jwt.ts
  * app/
  * pages/

### Promise inputs are awaited on the host

`call.*()` accepts `Promise` inputs. Knitting resolves them on the host before dispatch, so unresolved promise state never crosses the thread boundary. That is handy in request handlers: you can pass a body read directly.

```ts
// Fragment: `app` is your HTTP framework instance (e.g. Hono) and
// `pool` is a Knitting pool that registers a `validate` task.
app.post("/validate", async (c) => {
  const result = await pool.call.validate(c.req.text());
  return c.json(result);
});
```

If the input promise rejects, the host call rejects and the worker never runs.

### Pick the cheapest payload that fits

Smaller, flatter values move fastest: numbers, booleans, and short strings, then typed arrays and `Buffer`, then compact JSON. For bytes-plus-metadata, use `Envelope`. See [Supported payloads](/guides/payloads/).

### Start strict, open up later

Worker permissions are restricted by default: sensitive paths like `.env`, `.git`, `~/.ssh`, and `/etc` are blocked, and `node_modules` is deny-write. Open only what your tasks actually need. See [Permissions](/guides/permissions/).
## Footguns

The handful of things that trip people up most. Most of them follow from one fact: **workers re-import the module that defines your tasks.**

- **One argument per task.** `pool.call.add(a, b)` won't work; pass a tuple or object: `pool.call.add([a, b])` for `([a, b]) => a + b`.
- **Guard host code with `isMain`.** Without it, pool creation (and any other host-only code) re-runs inside every worker.
- **Top-level imports run in every worker.** `import` is hoisted, so it executes before any `isMain` check. Keep tasks in their own lean module so workers don't load your whole server framework.
- **Export your tasks.** An unexported `task()` / `importTask()` is invisible to the worker loader, so the call just hangs: no handler is ever registered.
- **`importTask` targets are plain functions**, not `task()` wrappers (that throws a `TypeError`). Put `timeout` / `abortSignal` options on the `importTask` call instead.
- **Worker `console.*` is silent by default** in strict mode. Pass `permission: { console: true }` to surface worker logs.
- **Can't tell what the pool is doing?** Pass `debug: true` to `createPool` (or set the `KNITTING_DEBUG=*` env var) to stream setup, import, and lifecycle diagnostics to **stderr**, each line tagged with the worker and a millisecond timer. Narrow the noise with namespaces: `debug: { host: true, imports: true }`. It costs nothing when off.
- **Only supported payloads cross the boundary.** `Map`, `Set`, class instances, and functions are rejected; see [Payloads](/guides/payloads/).
- **Dynamic payloads cap at ~8 MiB** by default; raise `payload.maxPayloadBytes` (and `payload.payloadMaxByteLength`) for larger ones.

> Tip: Knitting ships machine-readable docs. Point your agent at
> [`/llms.txt`](/llms.txt) for the essentials, or [`/llms-full.txt`](/llms-full.txt)
> for every page inlined into one file. With those in context, it gets the footguns
> above right far more often.
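
The payload-cap footgun above comes down to one piece of arithmetic: `maxPayloadBytes` may be at most one eighth of `payloadMaxByteLength` (the `>> 3` rule from the Essentials). The sketch below works through that arithmetic. The helper names are hypothetical and are not Knitting exports; only the `64 MiB` default and the `>> 3` relationship come from the documentation.

```typescript
// Illustrative arithmetic only; these helpers are hypothetical, not Knitting exports.
const MiB = 1024 * 1024;

// Default growth cap stated in the docs: 64 MiB per payload buffer.
const DEFAULT_PAYLOAD_MAX_BYTE_LENGTH = 64 * MiB;

// Largest allowed dynamic-payload cap for a given buffer limit (payloadMaxByteLength >> 3).
// Math.floor(x / 8) is used instead of `>> 3` so large values do not overflow int32.
const maxAllowedPayloadBytes = (payloadMaxByteLength: number): number =>
  Math.floor(payloadMaxByteLength / 8);

// Smallest payloadMaxByteLength that permits the requested maxPayloadBytes.
const minBufferLengthFor = (maxPayloadBytes: number): number => maxPayloadBytes * 8;

// Build a `payload` option object that satisfies the documented constraint.
const payloadOptionsFor = (maxPayloadBytes: number) => {
  if (!(maxPayloadBytes > 0)) throw new RangeError("maxPayloadBytes must be > 0");
  return {
    maxPayloadBytes,
    payloadMaxByteLength: Math.max(
      DEFAULT_PAYLOAD_MAX_BYTE_LENGTH,
      minBufferLengthFor(maxPayloadBytes),
    ),
  };
};

console.log(maxAllowedPayloadBytes(DEFAULT_PAYLOAD_MAX_BYTE_LENGTH) / MiB); // 8
console.log(payloadOptionsFor(32 * MiB));
// { maxPayloadBytes: 33554432, payloadMaxByteLength: 268435456 }
```

The object returned by `payloadOptionsFor` has the same two keys as the documented `payload` option, so under these assumptions a 32 MiB cap needs a 256 MiB growth limit. Keep in mind that the limits apply per worker and per direction, so larger caps multiply memory use.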
## Where to go next

- [Defining tasks](/guides/defining-tasks/): `task()`, `importTask()`, timeouts, and aborts.
- [Creating pools](/guides/creating-pools/): threads, balancers, and shutdown.
- [Payloads](/guides/payloads/): what crosses the boundary, and `Envelope`.
- [Performance](/guides/performance/) and the [inliner](/guides/inliner/): when to let the host run some work too.

Going further: Knitting can also run each worker as a **separate process**, even inside a `bwrap` sandbox or a container, for stronger isolation (see [Process workers](/guides/process-workers/)), and move large buffers with **`ProcessSharedBuffer`** (shared memory, no copy; see [Shared memory](/guides/shared-memory/)).

---

# Defining tasks

URL: https://knittingdocs.netlify.app/guides/defining-tasks/

A task is a function your workers run. Use a plain function for the simple case, or task() when you want timeouts, aborts, or imported worker code.

A **task** is a function your workers run. The simplest task is just an exported function; wrap it with `task({ f })` when you want options like timeouts or abort signals. Either way, define tasks at module scope and export them so workers can find them by name.

## The rules

- Define tasks at module scope, with no conditional or dynamic exports.
- Export them from the module where they are defined.
- One argument in, one value out. Use a tuple or object for multiple values.
- Keep tasks in their own file(s) so workers load only what they need.

## Module loading

Each worker **re-imports the module that defines your tasks**: the file that calls `task()` / `importTask()` and hands them to `createPool`. Two consequences are worth designing around:

- **Top-level `import`s run in every worker.** `import` statements are hoisted, so they execute *before* any `if (isMain)` guard: `isMain` gates your executable code, not your imports. If you define tasks in the same file as a web framework, every worker loads that framework too.
  Keep tasks in their own lean module and import it from your server, so workers only load what they run.
- **Tasks must be exported.** The worker discovers tasks by scanning the module's exports. An unexported `const myTask = task(...)` (or `importTask`) is invisible to the loader, so calling it just **hangs**: no handler is ever registered. Always `export` your tasks and `importTask` wrappers.

For full isolation (keeping a task's *own* code off the host entirely), use [`importTask`](#importing-worker-side-code-with-importtask): only the worker imports the target module, and that target must be a **plain exported function**, not a `task()` wrapper.

## A plain function

When a task needs no options, a bare exported function is enough:

```ts
import { createPool, isMain } from "knitting";

export const greet = (name: string) => `hello ${name}`;

if (isMain) {
  using pool = createPool({ threads: 1 })({ greet });

  console.log(await pool.call.greet("knitting")); // hello knitting
}
```

It must be exported from the module that creates the pool; workers re-import that module to find it. Anonymous inline functions can't be imported, so reach for `task()` (below) when you need more than a plain function.

## Wrapping with `task()`

Use `task({ f })` when you want options: a timeout, an abort signal, or explicit types.

```ts
import { task } from "knitting";

export const add = task({
  f: ([a, b]: [number, number]) => a + b,
});
```

Return types are inferred, but argument types are not; annotate the parameter, or pin both with generics:

```ts
export const add = task<[number, number], number>({
  f: ([a, b]) => a + b,
});
```

## Arguments and return values

Each task receives **one** argument and returns **one** value. For multiple inputs, pass a tuple or an object:

```ts
type ResizeInput = { width: number; height: number };

// Pin the argument type with generics so `width` and `height` are typed.
export const pixels = task<ResizeInput, number>({
  f: ({ width, height }) => width * height,
});
```

See [Payloads](/guides/payloads/) for everything that can cross the boundary.
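
If you already have ordinary multi-parameter functions, the one-argument rule above can be met with a small adapter. The sketch below is plain TypeScript with no Knitting import; `tupled` and `addTwo` are hypothetical names used for illustration, and whether an adapted function behaves identically once registered as a task is not covered by these docs.

```typescript
// `tupled` is a hypothetical helper for illustration; it is not part of Knitting.
// It turns (a, b, ...) => r into ([a, b, ...]) => r.
const tupled =
  <A extends unknown[], R>(fn: (...args: A) => R) =>
  (args: A): R =>
    fn(...args);

// An ordinary two-parameter function...
const addTwo = (a: number, b: number) => a + b;

// ...adapted to the one-argument (tuple) shape that tasks use.
// In a real task module this would be exported at module scope.
const add = tupled(addTwo);

// The object form needs no adapter: destructure the single argument.
type ResizeInput = { width: number; height: number };
const pixels = ({ width, height }: ResizeInput) => width * height;

console.log(add([2, 3])); // 5
console.log(pixels({ width: 4, height: 3 })); // 12
```

Tuples keep call sites short for two or three positional values; objects tend to read better once the inputs have names worth keeping.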
### Promise inputs are awaited on the host

`call.*()` also accepts a `Promise` as input. Knitting awaits it **on the host** before dispatch, so only plain values ever reach the worker:

- input promise fulfilled → the worker runs with the resolved value;
- input promise rejected → the host call rejects and the worker never runs;
- only native `Promise` is awaited (thenables are not).

That's why chaining works: `call.hello()` returns a promise, and Knitting resolves it before `world` runs.

```ts
const lines = await pool.call.world(pool.call.hello());
```

## Options

Beyond `f`, a task accepts a few options:

| Option | Purpose |
| --- | --- |
| `timeout` | Bound how long a call may run. |
| `abortSignal` | Make a task cancellable and abort-aware. |

### Timeouts

Use a timeout when a call should not wait forever:

```ts
export const maybeSlow = task({
  timeout: { time: 100, default: "timed out" },
  // Argument types are not inferred, so annotate the parameter.
  f: async (value: string) => value,
});
```

The shape of `timeout` decides what happens when the budget runs out:

| Form | Outcome |
| --- | --- |
| `number` (ms) | Rejects with `Error("Task timeout")`. |
| `{ time, default }` | Resolves with `default`. |
| `{ time, maybe: true }` | Resolves with `undefined`. |
| `{ time, error }` | Rejects with `error`. |

A missing or negative `time` disables the timeout.

> Note: The timeout races the task on the worker using the remaining budget (wall time
> minus dispatch latency). The underlying work may still finish even after its
> promise has resolved or rejected early.

### Abort signals

Opt a task into cooperative cancellation. The minimal form, `abortSignal: true`, makes the task abort-aware. On `shutdown()`, in-flight abort-aware calls reject with `"Thread closed"`:

```ts
export const longJob = task({
  abortSignal: true,
  f: async (data: string) => {
    // long running work...
    return data.length;
  },
});
```

`abortSignal: { hasAborted: true }` also injects a toolkit as the second argument, so the worker can poll `hasAborted()` and bail out early:

```ts
export const cpuWork = task({
  abortSignal: { hasAborted: true },
  f: (items: number[], toolkit) => {
    let sum = 0;
    for (const item of items) {
      if (toolkit.hasAborted()) throw new Error("Task aborted");
      sum += item;
    }
    return sum;
  },
});
```

The promise returned by an abort-aware call also exposes `.reject()`, so the **host** can cancel without touching the worker:

```ts
import { createPool, isMain, task } from "knitting";

export const slow = task({
  abortSignal: true,
  f: async () => {
    await new Promise((r) => setTimeout(r, 10_000));
    return "done";
  },
});

if (isMain) {
  using pool = createPool({ threads: 1 })({ slow });

  const promise = pool.call.slow();
  setTimeout(() => promise.reject?.("cancelled by host"), 100);

  try {
    await promise;
  } catch (e) {
    console.log(e); // "cancelled by host"
  }
}
```

> Note: Abort signals use a shared-memory bitset pool, sized by `abortSignalCapacity`
> in `createPool` (default `258`). It applies whenever at least one task declares
> `abortSignal`. When every slot is in use, further abort-aware calls reject with
> the `AbortSignalPoolExhausted` symbol.

## Importing worker-side code with `importTask`

`importTask({ href, name?, timeout?, abortSignal? })` points at a function in another module. The host gets a typed task wrapper but **never imports or evaluates that module itself**; only the worker does. That's the tool to reach for with [process workers](/guides/process-workers/) and sandboxing: keep the code you want isolated in a separate file and let the worker's permissions apply to it.

```ts
// worker-tasks.ts: only the worker imports this.
export const add = ([a, b]: [number, number]) => a + b;
```

```ts
// main.ts
import { createPool, importTask, isMain } from "knitting";

export const add = importTask<[number, number], number>({
  href: "./worker-tasks.ts",
  name: "add",
});

if (isMain) {
  using pool = createPool({ threads: 2 })({ add });

  console.log(await pool.call.add([2, 3])); // 5
}
```

`href` can be a relative path (resolved from the calling module), an absolute path, or a URL. `name` is the export to call and defaults to `"default"`.

Worker permission policy applies to the import, so a strict pool can still load task modules but limit what they read, write, or reach. See [Permissions](/guides/permissions/).

> Caution: The target export must be a **plain function**. Pointing `importTask` at a
> `task()` wrapper throws `TypeError: importTask expected export "…" to be a
> function`. Put task options on the host side instead: `importTask` accepts the
> same `timeout` and `abortSignal` options as `task()`.

### Importing from a URL

`href` can be remote, which is handy for shared task bundles:

```ts
import { createPool, importTask, isMain } from "knitting";

const REMOTE = "https://knittingdocs.netlify.app/example-task.mjs";

export const addFromWeb = importTask<[number, number], number>({
  href: REMOTE,
  name: "add",
});

if (isMain) {
  using pool = createPool({ threads: 2 })({ addFromWeb });

  console.log(await pool.call.addFromWeb([8, 5])); // 13
}
```

> Note: On Deno, keep `deno.lock` current for remote imports: update it with
> `deno cache --lock=deno.lock --frozen=false --reload`, then run frozen with
> `--lock=deno.lock --frozen=true`. Re-cache and commit the lockfile whenever the
> remote module changes.
## A task can make its own pool

For a quick script with a single task, chain `.createPool()` and skip the separate call:

```ts
import { isMain, task } from "knitting";

export const double = task({
  f: (n: number) => n * 2,
}).createPool({ threads: 2 });

if (isMain) {
  try {
    console.log(await double.call(21)); // 42
  } finally {
    await double.shutdown();
  }
}
```

The single-task pool is created where it's defined (module scope), so close it with `shutdown()` rather than `using`.

## Advanced: overriding `href` on `task()`

By default `task()` captures its own module URL and workers import from there. Passing `href` forces a different module path instead.

> Caution: The `href` override on `task()` is not a stable public contract and may be
> removed in a future major release. Prefer the default caller resolution; if you
> must use it, pass an absolute `file://` URL to a module that exports the task at
> top level, and pin your version.

---

# Creating pools

URL: https://knittingdocs.netlify.app/guides/creating-pools/

Spin up workers and call tasks.

`createPool(options)(tasks)` starts worker threads and returns helpers:

- `call.<task>(args)` enqueues a task and returns a promise.
- `shutdown(delayMs?)` stops workers immediately or after an optional delay.
- `[Symbol.dispose]` lets a `using` declaration close the pool when its scope ends.

Prefer `using pool = createPool(...)({ ... })` and let the pool dispose itself. `shutdown()` is still there, though: call `await pool.shutdown()` to close the pool **before** the `using` scope ends, when you need to await teardown, or when you are on a runtime without `using`. `using` simply calls `shutdown()` for you at scope exit.

`call.*()` always returns a promise. Inputs may also be native promises; they are resolved on the host before dispatch. If an input promise rejects, the host call rejects and the worker task is not executed. See [Promise inputs and awaited outputs](/guides/defining-tasks#promise-inputs-and-awaited-outputs).
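
The promise-input rules above can be modelled in a few lines. This is a sketch of the documented behaviour only, not Knitting's implementation: `callWithResolvedInput` and `workerStub` are hypothetical names, and using `instanceof Promise` to stand in for "native promise" is an assumption, since the docs do not say how the check is performed.

```typescript
// A model of the documented behaviour, not Knitting's code.
// Assumption: "native Promise" is approximated here with `instanceof Promise`.
const isNativePromise = (value: unknown): value is Promise<unknown> =>
  value instanceof Promise;

// Hypothetical stand-in for a host call: resolve the input, then dispatch.
async function callWithResolvedInput(
  dispatch: (value: unknown) => string,
  input: unknown,
): Promise<string> {
  // A native promise is awaited first; if it rejects, `dispatch` never runs.
  const value = isNativePromise(input) ? await input : input;
  // Thenables and plain values reach `dispatch` unchanged.
  return dispatch(value);
}

// Stub "worker" that reports what it received.
const workerStub = (value: unknown): string =>
  typeof value === "object" && value !== null && "then" in value
    ? "thenable passed through"
    : `value ${String(value)}`;

callWithResolvedInput(workerStub, Promise.resolve(5)).then(console.log); // value 5
callWithResolvedInput(workerStub, { then() {} }).then(console.log); // thenable passed through
callWithResolvedInput(workerStub, Promise.reject(new Error("bad input")))
  .catch((error: Error) => console.log(error.message)); // bad input
```

The practical consequence is that a rejected input costs no worker time, while a custom thenable is treated as data, so it is worth wrapping one in `Promise.resolve(...)` on the host if you want it awaited.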
> Note: `createPool(...)` can only be called from the main thread. Use `isMain` to guard
> entrypoints that create pools.

## Batching pattern

`call.*()` enqueues work and dispatches automatically. For batches, create all calls first and then await them together.

```ts
// Assumes `pool` was created with a `hello` task.
const jobs = Array.from({ length: 1_000 }, () => pool.call.hello());
const results = await Promise.all(jobs);
```

## Options

```ts
createPool({
  threads?: number,
  inliner?: {
    position?: "first" | "last",
    batchSize?: number,
    dispatchThreshold?: number,
  },
  balancer?:
    | {
      strategy?:
        | "roundRobin"
        | "robinRound"
        | "firstIdle"
        | "randomLane"
        | "firstIdleOrRandom"
    }
    | "roundRobin"
    | "robinRound"
    | "firstIdle"
    | "randomLane"
    | "firstIdleOrRandom",
  worker?: {
    runtime?: "thread" | "process",
    processRuntime?: "node" | "deno" | "bun",
    processCommandPrefix?: string[],
    processSharedMemory?:
      | "inherit"
      | "named"
      | {
        mode?: "inherit" | "named",
        namePrefix?: string,
        unlinkOnShutdown?: boolean,
      },
    bootstrap?: { href: string, name?: string, data?: unknown },
    resolveAfterFinishingAll?: true,
    timers?: {
      spinMicroseconds?: number,
      parkMs?: number,
      pauseNanoseconds?: number,
    },
    hardTimeoutMs?: number,
    resourceLimits?: {
      maxOldGenerationSizeMb?: number,
      maxYoungGenerationSizeMb?: number,
      codeRangeSizeMb?: number,
      stackSizeMb?: number,
    },
  },
  payload?: {
    mode?: "growable" | "fixed",
    payloadInitialBytes?: number,
    payloadMaxByteLength?: number,
    maxPayloadBytes?: number,
  },
  abortSignalCapacity?: number,
  host?: {
    stallFreeLoops?: number,
    maxBackoffMs?: number,
  },
  workerExecArgv?: string[],
  permission?: "strict" | "unsafe" | PermissionProtocol,
  dispatcher?: DispatcherSettings, // deprecated alias of host
  debug?: boolean | {
    host?: boolean,
    globals?: boolean,
    signals?: boolean,
    imports?: boolean,
    lifecycle?: boolean,
  },
  source?: string,
})
```

Deprecated payload aliases are still accepted at the top level:

- `payloadInitialBytes` -> `payload.payloadInitialBytes`
- `payloadMaxBytes` -> `payload.payloadMaxByteLength`
- `bufferMode` ->
  `payload.mode`
- `maxPayloadBytes` -> `payload.maxPayloadBytes`

### threads

Number of worker threads to spawn (default `1`). The total lane count is `threads + (inliner ? 1 : 0)`.

## payload

Payload transport tuning lives under `payload`.

### mode

Selects shared-buffer transport mode:

- `"growable"`: growable shared array buffer mode.
- `"fixed"`: fixed-size shared array buffer mode.

Default is `"growable"` when SAB growth is available, otherwise `"fixed"`.

### payloadMaxByteLength

Maximum size (in bytes) each payload buffer may grow to. Default is `64 MiB`.

### payloadInitialBytes

Initial payload buffer size in bytes. Default is `4 MiB` in growable mode. In fixed mode, startup uses the full `payloadMaxByteLength`.

### maxPayloadBytes

Hard cap for dynamic payload encoding. Must be `> 0` and `<= payloadMaxByteLength >> 3`. Default is `payloadMaxByteLength >> 3` (`8 MiB` with defaults).

Calls that exceed this cap are rejected with `KNT_ERROR_3` before dynamic slot reservation. If `payload.mode` is `"fixed"` and payload size is under the cap but still does not fit current capacity, the call is rejected with a controlled encoder error.

### Scope

Payload limits apply per worker and per direction (request payload + return payload), so each worker allocates two payload buffers with these limits.

### abortSignalCapacity

Maximum number of concurrent abort-aware calls the pool can track. Default is `258`, and it applies only when at least one task declares `abortSignal`. Only tasks defined with `abortSignal: true` or `abortSignal: { hasAborted: true }` count against this limit.
```ts
const pool = createPool({
  threads: 4,
  abortSignalCapacity: 1024,
})({ myAbortableTask });
```

## Example

Basic example:

```ts
import { createPool, isMain, task } from "knitting";

export const add = task<[number, number], number>({
  f: async ([a, b]) => a + b,
});

if (isMain) {
  using pool = createPool({ threads: 2 })({ add });

  const results = await Promise.all([
    pool.call.add([1, 2]),
    pool.call.add([3, 4]),
  ]);
  console.log(results); // [3, 7]
}
```

> Danger: After `shutdown()` runs, in-flight and future `call.*()` promises reject
> with `"Thread closed"`.

## balancer

Controls how calls are routed across lanes (threads, plus optional inliner). Pass a string or an object with a `strategy` key.

- `roundRobin` (default): round-robin rotation through all lanes.
- `robinRound`: legacy alias of `roundRobin`.
- `firstIdle`: pick the first idle lane, else fall back to round-robin.
- `randomLane`: pick a random lane.
- `firstIdleOrRandom`: pick the first idle lane, else random.

When only one thread is spawned and no inliner is enabled, the balancer is bypassed and calls go directly to the single worker.

## inliner

Adds an extra lane that runs tasks on the main thread.

- `position`: whether the inline lane appears before (`"first"`) or after (`"last"`) the worker lanes for balancing.
- `batchSize`: max tasks processed per event-loop tick (default `1` when enabled).
- `dispatchThreshold`: minimum in-flight calls per invoker before the inline lane is eligible (default `1`).

See [Inliner guide](/guides/inliner) for detail.

## worker

### runtime

`"thread"` (default) runs workers as runtime-local threads, the lowest-overhead option. `"process"` runs each worker as a separate OS process for stronger isolation, and unlocks `processRuntime`, `processCommandPrefix`, and `processSharedMemory`. See [Process workers](/guides/process-workers) for the full story: sandboxes, containers, and the stdin / fd-0 handshake.
### bootstrap

A privileged module imported and awaited once per worker before task modules load: `{ href, name?, data? }`. Use it to install runtime guards, strip environment variables, or prepare worker-only globals. Bootstrap is worker-only and cannot be combined with the inline lane.

### resolveAfterFinishingAll

When set to `true`, workers wait for all pending promises to settle before exiting.

### timers

Idle behavior tuning while workers have no work:

- `spinMicroseconds`: busy-spin budget before parking.
- `parkMs`: `Atomics.wait` timeout while parked.
- `pauseNanoseconds`: `Atomics.pause` duration while spinning. Set `0` to disable.

### hardTimeoutMs

Hard wall-clock timeout for each task call. On timeout, the pool force-shuts down to stop runaway CPU execution.

### resourceLimits

Node.js worker memory/stack limits:

- `maxOldGenerationSizeMb`
- `maxYoungGenerationSizeMb`
- `codeRangeSizeMb`
- `stackSizeMb`

## workerExecArgv

Extra Node.js `execArgv` flags passed to workers, for example `["--expose-gc", "--max-old-space-size=4096"]`. When `permission` is set to `"unsafe"`, inherited Node permission flags (`--allow-fs-read`, `--allow-fs-write`, etc.) are stripped.

## permission

Runtime permission flag policy for workers.

- Omit `permission`: strict defaults plus `allowImport: true` (web imports allowed).
- `"strict"` (default when passing an object): computes conservative defaults.
- `"unsafe"`: disables permission flags and strips inherited Node permission flags.
- In object mode, `console` defaults to `false` in strict mode and `true` in unsafe mode.

See [Permissions guide](/guides/permissions) for runtime-specific mapping and strict defaults.

## Timing note

Worker timing paths capture a high-resolution `performance.now()` reference at module load/startup time. This keeps scheduling and timeout precision stable without freezing global `performance`.
## Safety hardening defaults

- Startup-only guard layer: safety hooks are installed once before the worker loop starts (no extra checks inside the hot task-processing loop).
- Process termination APIs from task code are blocked: `process.exit`, `process.kill`, `process.abort`, plus `Deno.exit` when present.
- Permission enforcement is delegated to runtime-native mechanisms (Node worker permission flags and Deno worker permissions when available). In-process FS/network/env monkey-patching is not performed.

## host

Host dispatcher backoff and scheduling options.

### stallFreeLoops

How many notify loops run before backoff starts (default `128`).

### maxBackoffMs

Maximum backoff delay in milliseconds once the dispatcher starts stalling (default `10`).

## dispatcher

Deprecated alias of `host`.

## debug

Streams diagnostics to **stderr**, each line tagged with the worker (`host`, or `w0`, `w1`, … for the thread/process workers), the runtime, and a millisecond timer relative to when that worker's debug initialised. Pass `true` to enable everything, or turn on individual namespaces:

- `host`: host-side pool setup — cwd and caller, each registered task, runtime / workers / lanes / inliner, the module list, permission mode, and worker bootstrap.
- `imports`: how many tasks each worker loaded, and from which modules.
- `lifecycle`: the worker "ready" line and process-worker lifecycle events.
- `signals`: per-dispatch worker traffic (work / result / run / idle). Very chatty.
- `globals`: `globalThis` changes across the worker's bootstrap and task phases, so you can see which loader injected which global.

Enable the same namespaces without touching code through the `KNITTING_DEBUG` environment variable — a comma-separated list (`KNITTING_DEBUG=host,imports`) or `*` for all. The option and the env var merge; either one can turn a namespace on.

Debug is zero-cost when off: with no namespace active, the logger module is never even imported.
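The merge rule for the `debug` option and `KNITTING_DEBUG` can be pictured as a union of two sources. The sketch below models that rule under stated assumptions: `resolveDebug` is a hypothetical function, the environment value is passed in as a plain string, and unknown tokens are ignored. It does not reproduce Knitting's own parser.

```typescript
const NAMESPACES = ["host", "imports", "lifecycle", "signals", "globals"] as const;
type Namespace = (typeof NAMESPACES)[number];
type DebugOption = boolean | Partial<Record<Namespace, boolean>> | undefined;

// Union of the namespaces enabled by the option and by the env value.
function resolveDebug(option: DebugOption, envValue: string | undefined): Set<Namespace> {
  const active = new Set<Namespace>();

  // Source 1: the `debug` option (`true` means every namespace).
  if (option === true) {
    for (const ns of NAMESPACES) active.add(ns);
  } else if (option) {
    for (const ns of NAMESPACES) if (option[ns]) active.add(ns);
  }

  // Source 2: the env value, a comma-separated list or `*` for all.
  for (const raw of (envValue ?? "").split(",")) {
    const token = raw.trim();
    if (token === "*") {
      for (const ns of NAMESPACES) active.add(ns);
    } else if ((NAMESPACES as readonly string[]).includes(token)) {
      active.add(token as Namespace);
    }
  }
  return active;
}

const merged = resolveDebug({ host: true }, "imports");
const everything = resolveDebug(false, "*");
const nothing = resolveDebug(undefined, undefined);
```

Because either source can only add namespaces, setting the env var never silences a namespace that the code enabled.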
## source

Override worker entry module URL/path explicitly.

## Limits

A single pool supports up to **65,536 tasks** (function IDs are stored as `Uint16`, range `0..0xFFFF`). Passing more tasks throws a `RangeError`.

---

# Payloads

URL: https://knittingdocs.netlify.app/guides/payloads/

Data types you can send through knitting.

The transport supports the following payloads:

- `number` (including `NaN`, `Infinity`, and `-Infinity`)
- `string`
- `boolean`
- `bigint`
- `undefined` and `null`
- plain JSON-like `Object` and `Array`
- `Envelope<H, B>` where `H` is JSON-like and `B` is `ArrayBuffer` (default), `SharedArrayBuffer`, `ProcessSharedBuffer`, or `BufferReference`
- `Buffer` (Node.js), `ArrayBuffer`
- `Uint8Array`, `Int32Array`, `Float64Array`, `BigInt64Array`, `BigUint64Array`
- `DataView`
- `ProcessSharedBuffer` for zero-copy shared memory (see [Shared memory](/guides/shared-memory/))
- `BufferReference` from `knitting/unsafe` for zero-copy buffers to thread workers (see [Buffer reference](/guides/buffer-reference/))
- `Error` (name, message, stack, and cause chain)
- `Date`
- `symbol` from `Symbol.for(...)` only
- native `Promise` values at the host call boundary

If you need multiple values, pass a tuple or object as the single argument.

These types are not supported directly:

- `Map`, `Set`, `WeakMap`, and custom class instances (except `Envelope` and subclasses)
- non-global symbols
- `Blob`
- functions

Promise values are accepted at the `call.*()` boundary, but promises themselves are runtime state and are not transferred through IPC as payloads. Only resolved values are serialized and sent to workers. If a promise input rejects, the host call rejects and the worker task is not executed. Only native `Promise` is accepted; thenables are treated as regular values. See [Promise inputs and awaited outputs](/guides/defining-tasks#promise-inputs-and-awaited-outputs) for details.
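Two of the rules above map onto plain JavaScript checks. The sketch below is illustrative only: `isGlobalSymbol` and `isNativePromise` are hypothetical helpers, and the `instanceof` test is an assumption about how a native promise could be told apart from a thenable, not a statement about Knitting's internals.

```typescript
// A symbol from the global registry has a key that can be looked up again.
function isGlobalSymbol(value: symbol): boolean {
  return Symbol.keyFor(value) !== undefined;
}

// Thenables are plain objects, so an `instanceof` test leaves them out.
function isNativePromise(value: unknown): value is Promise<unknown> {
  return value instanceof Promise;
}

const shared = Symbol.for("knitting.example");
const local = Symbol("knitting.example");

// A registry symbol can be rebuilt from its key; a local one has no key.
const rebuilt = Symbol.for(Symbol.keyFor(shared)!);

const thenable = { then: (resolve: (value: number) => void) => resolve(1) };
const checks = {
  sharedIsGlobal: isGlobalSymbol(shared),
  localIsGlobal: isGlobalSymbol(local),
  promiseIsNative: isNativePromise(Promise.resolve(1)),
  thenableIsNative: isNativePromise(thenable),
};
```

The registry key is what makes a `Symbol.for(...)` value reproducible on another thread or process, which is one plausible reason non-global symbols are excluded.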
```ts
type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONArray
  | JSONObject;

interface JSONObject {
  [key: string]: JSONValue;
}

interface JSONArray extends Array<JSONValue> {}

type ValidInput =
  | bigint
  | void
  | JSONValue
  | symbol
  | Envelope
  | Uint8Array
  | Int32Array
  | Float64Array
  | BigInt64Array
  | BigUint64Array
  | DataView
  | ArrayBuffer
  | Error
  | Date;

type Args = ValidInput | Serializable;

type TaskInput = NoBlob<Args> | Promise<NoBlob<Args>>;
```

## Picking the right payload type

Different types take different code paths and have very different costs. Use the cheapest type that fits your data.

**Header-only (fastest):** `number`, `boolean`, `undefined`, `null`, `Date`, small `string`, small `bigint`. These fit in the call header with near-zero overhead. Prefer these whenever possible.

**Static payload (fast):** `Symbol.for(...)`, large `bigint`, typed arrays (`Uint8Array`, `Int32Array`, etc.). Reuses a small buffer region alongside the header.

**Dynamic payload (allocator path):** `Object`, `Array` (JSON-serialized), `Error`, and larger strings. These need allocation, serialization, and copying. Still faster than `postMessage`, but measurably heavier in hot loops.

See the [Performance guide](/guides/performance) for per-type benchmarks and batching guidance.

## Envelope

`Envelope<H, B>` pairs a JSON-serializable header with a binary body. Use it when a call needs both structured metadata and raw bytes — the transport carries one special binary value per call, so an envelope is how you attach a header to one.

```ts
import { Envelope } from "knitting";

const message = new Envelope(
  { route: "/upload", contentType: "application/octet-stream" },
  new Uint8Array([1, 2, 3]).buffer,
);
```

The second type parameter `B` sets the body type and defaults to `ArrayBuffer`. The supported body types are:

| Body | Copy? | Works with | Notes |
|------|-------|------------|-------|
| `ArrayBuffer` | copied | thread + process | Default. Works everywhere. |
| `SharedArrayBuffer` | zero-copy, shared | thread only | Shared by reference; process workers reject it. |
| `ProcessSharedBuffer` | zero-copy, shared | thread + process | Cross-process shared memory. |
| `BufferReference` | zero-copy, moved | thread only | From `knitting/unsafe`. Source is detached on construction. |

`Envelope` is disposable — `[Symbol.dispose]()` disposes a disposable body (a `BufferReference`) and is a no-op for `ArrayBuffer` and `SharedArrayBuffer`. Use `using` to dispose automatically when the envelope goes out of scope:

```ts
using result = await pool.call.processImage(
  new Envelope({ format: "png" }, buffer),
);
console.log(result.header);
// result is disposed here when the `using` scope exits
```

For `BufferReference` bodies, see [Buffer reference](/guides/buffer-reference/). For `ProcessSharedBuffer` bodies, see [Shared memory](/guides/shared-memory/).

## Errors

`Error` payloads preserve `name`, `message`, `stack`, and recursive `cause` chains. Both sync throws and async rejections inside tasks propagate back to the host as `Error` objects.

See also: [knitting/utils](/guides/utils/) for buffer serialization helpers and [Buffer reference](/guides/buffer-reference/) for zero-copy thread payloads.

---

# Buffer utilities

URL: https://knittingdocs.netlify.app/guides/utils/

knitting/utils — helpers for serializing strings, JSON, and numbers into SharedArrayBuffer and back.

`knitting/utils` is a set of helpers for converting between JavaScript values and raw buffer bytes. Import from the subpath:

```ts
import {
  bufferToBytes,
  bytesToBuffer,
  bufferToString,
  stringToBuffer,
  bufferToJson,
  jsonToBuffer,
  numbersToBuffer,
  bufferToNumbers,
} from "knitting/utils";
```

All functions accept `ArrayBuffer`, `SharedArrayBuffer`, or any typed-array view as input (`BufferLike`). Encode functions return `SharedArrayBuffer` so the result can be handed to a worker payload or stored in shared memory without an extra copy.
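As a rough picture of what accepting a `BufferLike` and returning a `SharedArrayBuffer` involves, the sketch below normalizes the three input shapes to bytes and copies them into shared memory. The `BufferLike` alias is written out here as an assumption about its shape, and `toBytes` and `copyToShared` are hypothetical stand-ins, not the library's functions.

```typescript
// Assumed shape of the accepted inputs: raw buffers or any view over one.
type BufferLike = ArrayBuffer | SharedArrayBuffer | ArrayBufferView;

// Normalize any input to a byte view without copying.
function toBytes(source: BufferLike): Uint8Array {
  if (source instanceof Uint8Array) return source;
  if (ArrayBuffer.isView(source)) {
    // Respect the view's window instead of exposing the whole backing buffer.
    return new Uint8Array(source.buffer, source.byteOffset, source.byteLength);
  }
  return new Uint8Array(source);
}

// Copy the bytes into a fresh SharedArrayBuffer.
function copyToShared(source: BufferLike): SharedArrayBuffer {
  const bytes = toBytes(source);
  const sab = new SharedArrayBuffer(bytes.byteLength);
  new Uint8Array(sab).set(bytes);
  return sab;
}

const input = new Uint8Array([9, 1, 2, 3]).subarray(1); // a view with an offset
const sab = copyToShared(input);
const copied = Array.from(new Uint8Array(sab));
```

Handling `byteOffset` and `byteLength` matters for views such as `subarray` results, which share a larger backing buffer.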

---

## Strings

```ts
const sab = stringToBuffer("hello"); // UTF-8 → SharedArrayBuffer
const text = bufferToString(sab);    // SharedArrayBuffer → string
```

`stringToBuffer` uses Node's `Buffer.from` when available (avoids a TextEncoder allocation) and falls back to `TextEncoder` on Deno and Bun.

---

## JSON

```ts
const sab = jsonToBuffer({ status: "ok" }); // JSON.stringify → SharedArrayBuffer
const obj = bufferToJson(sab);              // SharedArrayBuffer → parsed value
```

`bufferToJson` is `JSON.parse(bufferToString(source))`. If the value is not JSON-serializable, `jsonToBuffer` throws.

---

## Raw bytes

```ts
const bytes: Uint8Array = bufferToBytes(source);      // any BufferLike → Uint8Array
const sab: SharedArrayBuffer = bytesToBuffer(source); // any BufferLike → SharedArrayBuffer
```

`bufferToBytes` does not copy if the source is already a `Uint8Array`. `bytesToBuffer` always copies into a fresh `SharedArrayBuffer`.

---

## Numbers

```ts
type NumberFormat = "f64" | "f32" | "i32"; // default: "f64"

const sab = numbersToBuffer([1.1, 2.2, 3.3], { format: "f64" });
const arr: Float64Array = bufferToNumbers(sab, { format: "f64" });

const isab = numbersToBuffer([1, 2, 3], { format: "i32" });
const iarr: Int32Array = bufferToNumbers(isab, { format: "i32" });
```

`bufferToNumbers` returns a typed-array view (zero-copy when the byte offset is aligned). If the buffer's byte length is not a multiple of the element size for the chosen format, it throws a `RangeError`.
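The length rule for number views follows from the element sizes: 8 bytes for `f64`, 4 bytes for `f32` and `i32`. A small sketch of that check using only standard typed arrays is shown below; `viewNumbers` and `ELEMENT_BYTES` are hypothetical names, and the function is a stand-in for illustration rather than `bufferToNumbers` itself.

```typescript
type NumberFormat = "f64" | "f32" | "i32";

const ELEMENT_BYTES: Record<NumberFormat, number> = { f64: 8, f32: 4, i32: 4 };

// Create a zero-copy numeric view, rejecting buffers with a partial element.
function viewNumbers(
  buffer: ArrayBuffer | SharedArrayBuffer,
  format: NumberFormat = "f64",
): Float64Array | Float32Array | Int32Array {
  const size = ELEMENT_BYTES[format];
  if (buffer.byteLength % size !== 0) {
    throw new RangeError(`byte length ${buffer.byteLength} is not a multiple of ${size}`);
  }
  if (format === "f64") return new Float64Array(buffer);
  if (format === "f32") return new Float32Array(buffer);
  return new Int32Array(buffer);
}

const sab = new SharedArrayBuffer(16);
new Float64Array(sab).set([1.5, 2.5]);

const values = viewNumbers(sab, "f64"); // two elements over the same memory
values[0] = 9; // writes through to `sab`, because nothing was copied

let rejected = false;
try {
  viewNumbers(new SharedArrayBuffer(10), "f64"); // 10 is not a multiple of 8
} catch (error) {
  rejected = error instanceof RangeError;
}
```

Since the view shares memory with the buffer, writes through it are visible to every other view of the same bytes.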

---

## Typical use

Passing a string to a worker through shared memory so the bytes are not copied on every call:

```ts
import { createPool, isMain, task } from "knitting";
import {
  getDefaultProcessSharedBufferPrimitives,
  ProcessSharedBuffer,
} from "knitting/shared-memory";
import { stringToBuffer, bufferToString } from "knitting/utils";

export const shout = task<ProcessSharedBuffer, string>({
  f: (buf) => bufferToString(buf.view(Uint8Array)).toUpperCase(),
});

if (isMain) {
  using pool = createPool({ threads: 1 })({ shout });
  const primitives = getDefaultProcessSharedBufferPrimitives();

  const encoded = stringToBuffer("hello knitting"); // SharedArrayBuffer
  const shared = ProcessSharedBuffer.create(encoded.byteLength, primitives);
  // Write through the view itself: `new Uint8Array(view)` would copy the view
  // and leave the shared memory untouched.
  shared.view(Uint8Array).set(new Uint8Array(encoded));

  console.log(await pool.call.shout(shared)); // HELLO KNITTING
  shared.descriptor.mapping?.close?.();
}
```

See [Payloads](/guides/payloads/) for the full list of types that can cross the worker boundary.

---

# Permissions

URL: https://knittingdocs.netlify.app/guides/permissions/

Runtime permission policy for worker processes.

Workers execute task code, so permission policy controls what those workers can access at runtime.

Set `permission` in `createPool(...)`:

```ts
import { createPool, isMain, task } from "knitting";

export const work = task({
  f: async (x: number) => x * 2,
});

if (isMain) {
  using pool = createPool({
    threads: 2,
    permission: { mode: "strict" },
  })({ work });

  console.log(await pool.call.work(21)); // 42
}
```

## Modes

- omit `permission`: strict defaults plus `allowImport: true` (web imports allowed).
- `permission: {}` or `permission: { mode: "strict" }`: conservative defaults in explicit strict mode.
- `permission: "unsafe"`: disables permission flags and strips inherited Node permission flags.

`console` can be set in object mode for compatibility. Default is `false` in strict mode and `true` in unsafe mode.
```ts
createPool({ permission: { mode: "strict", console: true } })({ work });
createPool({ permission: "unsafe" })({ work });
```

## Fine-grained policy

In object mode you grant exactly what your tasks need — anything you don't list stays denied:

```ts
createPool({
  permission: {
    mode: "strict",
    allowImport: true,              // allow task-module imports
    read: ["./data"],               // path allow-list (or `true` for all)
    write: ["./out"],
    net: ["api.example.com"],       // host allow-list (or `true` for all)
    env: { allow: ["NODE_ENV"] },
    run: ["git"],                   // subprocess allow-list
    console: true,
  },
})({ work });
```

| Field | Meaning |
| --- | --- |
| `read` / `write` | Path allow-lists. `true` means unrestricted. |
| `denyRead` / `denyWrite` | Explicit deny entries layered on top. |
| `net` / `denyNet` | Network host allow / deny lists. |
| `allowImport` | Modules the worker may import. `true` allows all. |
| `env` | `{ allow, deny, files }` for environment-variable access. |
| `run` / `denyRun` | Subprocess execution allow / deny lists. |
| `console` | Let worker `console.*` reach the host. |

Enforcement uses each runtime's native mechanism, so coverage varies (see [Runtime mapping](#runtime-mapping)). Treat permissions as a guardrail, not the only boundary for hostile code — for that, combine them with [process workers](/guides/process-workers/).

## Strict defaults

Strict mode computes a conservative profile:

- read/write rooted at current `cwd`
- deny-write for `node_modules`
- deny read/write for sensitive paths: `.env`, `.git`, `.npmrc`, `.docker`, `.secrets`, `~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.azure`, `~/.config/gcloud`, `~/.kube`
- deny read/write for POSIX-sensitive paths: `/proc`, `/proc/self`, `/proc/self/environ`, `/proc/self/mem`, `/sys`, `/dev`, `/etc`
- read support for `deno.lock` and `bun.lock*`

`permission: "unsafe"` disables runtime permission flags and strips inherited Node permission flags from worker `execArgv`.
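The phrase "deny entries layered on top" can be read as: a path must fall under an allow root and under no deny root. The toy model below illustrates that reading for the strict write profile. It rests on several assumptions: paths are absolute, normalized POSIX paths, matching is by directory prefix, and deny wins over allow. `PathPolicy`, `isUnder`, and `isAllowed` are hypothetical names, and the real matching is done by each runtime, not by code like this.

```typescript
type PathPolicy = { allow: string[]; deny: string[] };

// True when `path` equals `root` or sits underneath it.
function isUnder(path: string, root: string): boolean {
  const base = root.endsWith("/") ? root.slice(0, -1) : root;
  return path === base || path.startsWith(base + "/");
}

// Deny entries are checked first, so they override any allow root.
function isAllowed(path: string, policy: PathPolicy): boolean {
  if (policy.deny.some((root) => isUnder(path, root))) return false;
  return policy.allow.some((root) => isUnder(path, root));
}

// A strict-style write profile for a project rooted at /app (example values).
const write: PathPolicy = {
  allow: ["/app"],
  deny: ["/app/node_modules", "/app/.env"],
};

const decisions = {
  output: isAllowed("/app/out/report.json", write),
  dependency: isAllowed("/app/node_modules/pkg/index.js", write),
  secrets: isAllowed("/app/.env", write),
  outside: isAllowed("/etc/passwd", write),
  lookalike: isAllowed("/application/file.txt", write),
};
```

The `lookalike` case shows why the check compares whole path segments: `/application` shares a string prefix with `/app` but is not inside it.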
## Runtime mapping

Permission protocol values are mapped to each runtime differently.

### Node.js workers

Workers receive `--permission` / `--experimental-permission` plus:

- `--allow-fs-read`
- `--allow-fs-write`
- `--allow-worker`
- `--allow-child-process`
- `--allow-addons`
- `--allow-wasi`

Node worker flags are allow-list based, so protocol deny lists are not expressible as Node worker flags.

### Deno workers

Workers receive `Worker.deno.permissions` when enabled.

This is applied only when one of these is true:

- `--unstable-worker-options` is detected (Linux `/proc` probe), or
- `KNITTING_DENO_WORKER_PERMISSIONS=1` is set.

### Bun workers

Bun currently has no worker permission flags. Protocol values are accepted for API compatibility but are not enforced by Bun runtime flags.

## Process execution overrides

Object mode supports runtime-specific process execution overrides:

- `node.allowChildProcess?: boolean`
- `deno.allowRun?: boolean` — legacy, superseded by the top-level `run` allow-list.

Both default to `false` in strict mode.

---

# Process workers

URL: https://knittingdocs.netlify.app/guides/process-workers/

Run each worker as a separate OS process for stronger isolation — through a sandbox like bubblewrap or a container like Docker.

By default Knitting runs workers as **threads** — the lowest-overhead option. When you need stronger isolation, run each worker as a **separate process**. A process worker has its own memory and permissions, and — because it's just a child process — you can launch it inside a sandbox like `bwrap` or a container like Docker.

```ts
const pool = createPool({
  threads: 2,
  worker: {
    runtime: "process",
    processRuntime: "node", // "node" | "deno" | "bun" (default "deno")
  },
})({ add });
```

That's the whole switch: the same `call` API and the same tasks, now running in another process. Everything below is about wrapping that process.
## Keep isolated code out of the host

Isolation only helps if the code you want contained never runs on the host. Use [`importTask`](/guides/defining-tasks/#importing-worker-side-code-with-importtask) so the host holds a typed wrapper while only the worker imports and evaluates the module:

```ts
// worker-tasks.ts — only the sandboxed worker imports this.
export const add = ([a, b]: [number, number]) => a + b;
```

```ts
// main.ts
import { importTask } from "knitting";

export const add = importTask<[number, number], number>({
  href: "./worker-tasks.ts",
  name: "add",
});
```

## The fd-0 handshake

Process workers receive their shared-memory handle on **stdin — file descriptor 0**. That one detail decides how a wrapper has to behave:

- Wrappers that leave stdin alone (most sandboxes) work as-is, inheriting the fd.
- Wrappers that replace, close, or proxy stdin (most containers) break the handshake. For those, switch to **named** shared memory so the worker reopens the mapping by name instead of inheriting an fd:

```ts
worker: {
  runtime: "process",
  processSharedMemory: "named", // or { mode: "named", namePrefix: "knit" }
}
```

Named memory needs both sides in the **same OS IPC namespace** (for Docker, `--ipc=host`).

> Note: On Windows, Knitting always uses named shared memory for process workers, so you
> don't need to set `processSharedMemory` yourself.

## `processCommandPrefix`

`processCommandPrefix` is the wrapper command placed **before** Knitting's own worker launch. Knitting appends the real command — runtime plus the worker file — after your prefix, so the prefix is just "how to start a process that will then run the worker."

### Bubblewrap (keeps fd 0)

`bwrap` preserves stdin, so the inherited-fd path works and no named memory is needed.
This runs Bun workers with new namespaces (no network) and a read-only filesystem:

```ts
import { createPool, importTask, isMain } from "knitting";

export const add = importTask<[number, number], number>({
  href: "./worker-tasks.ts",
  name: "add",
});

if (isMain) {
  using pool = createPool({
    worker: {
      runtime: "process",
      processRuntime: "bun",
      processCommandPrefix: [
        "bwrap",
        "--unshare-all",              // new namespaces, no network
        "--ro-bind", "/", "/",        // read-only filesystem
        "--dev-bind", "/dev", "/dev",
        "--proc", "/proc",
        "--tmpfs", "/tmp",            // writable scratch
        "--die-with-parent",
      ],
    },
  })({ add });

  console.log(await pool.call.add([1, 2])); // 3
}
```

> Caution: `--tmpfs <path>` masks whatever is at that path. Don't put your project — the
> files the worker imports — under a tmpfs-masked path, or the worker can't load
> them. And remember it's isolation, not a full sandbox: a read-only bind is still
> readable.

### Docker (named shared memory)

Containers replace stdin, so use `processSharedMemory: "named"` and share the IPC namespace with `--ipc=host`. The container also needs to see the same files at the same path (a volume mount) and receive Knitting's two boot env vars.

```ts
// docker-worker-tasks.ts — imported only inside the container.
import { isMain } from "knitting";

export const addOne = (n: number) => n + 1;
export const reportIsMain = () => isMain; // false: this runs off the host
```

```ts
// main.ts
import { spawn } from "node:child_process";
import { mkdtemp, readFile, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { createPool, importTask, isMain } from "knitting";

const docker = process.env.DOCKER_BINARY ?? "docker";
const image = process.env.KNITTING_DOCKER_IMAGE ?? "node:24-trixie-slim";
const cwd = process.cwd();

export const addOne = importTask<number, number>({
  href: "./docker-worker-tasks.ts",
  name: "addOne",
});
export const reportIsMain = importTask<void, boolean>({
  href: "./docker-worker-tasks.ts",
  name: "reportIsMain",
});

// Record the container id so we can always clean it up.
const dockerPrefix = (cidfile: string): string[] => [
  docker, "run",
  "--ipc=host",                         // share the shared-memory namespace
  "--cidfile", cidfile,
  "-v", `${cwd}:${cwd}`,                // same files, same path
  "-w", cwd,
  "-e", "KNITTING_PROCESS_WORKER",      // forward Knitting's boot payload
  "-e", "KNITTING_PROCESS_WORKER_BOOT",
  image,
];

const removeContainer = async (cidfile: string) => {
  const id = await readFile(cidfile, "utf8").then((s) => s.trim()).catch(() => "");
  if (!id) return;
  await new Promise<void>((resolve) => {
    spawn(docker, ["rm", "-f", id], { stdio: "ignore" })
      .once("error", () => resolve())
      .once("exit", () => resolve());
  });
};

if (isMain) {
  const dir = await mkdtemp(join(tmpdir(), "knitting-docker-"));
  const cidfile = join(dir, "worker.cid");

  const pool = createPool({
    threads: 1,
    worker: {
      runtime: "process",
      processRuntime: "node",
      processSharedMemory: "named",
      processCommandPrefix: dockerPrefix(cidfile),
    },
    permission: "unsafe", // the container is the boundary here
  })({ addOne, reportIsMain });

  try {
    const [value, workerIsMain] = await Promise.all([
      pool.call.addOne(41),
      pool.call.reportIsMain(),
    ]);
    console.log({ value, workerIsMain, ok: value === 42 && workerIsMain === false });
  } finally {
    await removeContainer(cidfile);
    await pool.shutdown().catch(() => undefined);
    await rm(dir, { recursive: true, force: true });
  }
}
```

Why `permission: "unsafe"` here? The container — not Knitting's per-worker flags — is the isolation boundary, so the in-container worker runs without extra permission flags. Keep the image and mounts tight instead. See [Permissions](/guides/permissions/).
## Choosing a wrapper

| Wrapper | stdin (fd 0) | Shared memory |
| --- | --- | --- |
| None (plain process) | inherited | inherited (default) |
| `bwrap` / sandbox | preserved | inherited (default) |
| Docker / container | replaced | `"named"` + `--ipc=host` |

This is same-host communication. Named shared memory is fast because both sides map the same bytes, but it's not a network transport. For the lower-level building block behind all of this, see [Shared memory](/guides/shared-memory/).

---

# Shared memory

URL: https://knittingdocs.netlify.app/guides/shared-memory/

ProcessSharedBuffer — the lower-level shared-memory channel for passing bytes between workers and processes without copying.

`ProcessSharedBuffer` is the building block under process workers: a block of shared memory two processes can read and write **without copying** the payload on every call. Reach for it when workers or processes need to see the *same bytes* — counters, ring buffers, large frames — instead of message-passing copies.

It lives on a subpath:

```ts
import {
  getDefaultProcessSharedBufferPrimitives,
  ProcessSharedBuffer,
} from "knitting/shared-memory";
```

The `primitives` are the platform's shared-memory functions. Grab the defaults once and reuse them.

## Anonymous buffers (parent ↔ child)

The default is anonymous: a private handle passed intentionally through Knitting's transport. It's the safest option and needs no name.
```ts
import { createPool, isMain, task } from "knitting";
import {
  getDefaultProcessSharedBufferPrimitives,
  ProcessSharedBuffer,
} from "knitting/shared-memory";

export const readFirstCell = task<ProcessSharedBuffer, number>({
  f: (buffer) => Atomics.load(buffer.view(Int32Array), 0),
});

if (isMain) {
  using pool = createPool({ threads: 1 })({ readFirstCell });
  const primitives = getDefaultProcessSharedBufferPrimitives();
  const shared = ProcessSharedBuffer.create(64, primitives);

  try {
    Atomics.store(shared.view(Int32Array), 0, 42);
    console.log(await pool.call.readFirstCell(shared)); // 42
  } finally {
    shared.descriptor.mapping?.close?.();
  }
}
```

A `ProcessSharedBuffer` is a supported [payload](/guides/payloads/), so you pass it straight to a task. `view(Int32Array)` returns a typed-array view over the same memory — pair it with `Atomics` for safe cross-process reads and writes.

## Named channels (independent processes)

When two processes don't share a parent — so there's no fd to inherit — use a **named** channel. One side creates the name, the other opens it.

```ts
const name = "knitting-demo-channel";
const primitives = getDefaultProcessSharedBufferPrimitives();

const owner = ProcessSharedBuffer.create(
  { name, size: 64, mode: "create" },
  primitives,
);

try {
  Atomics.store(owner.view(Int32Array), 0, 7);

  const peer = ProcessSharedBuffer.create(
    { name, size: 64, mode: "open" },
    primitives,
  );
  try {
    console.log(Atomics.load(peer.view(Int32Array), 0)); // 7
  } finally {
    peer.descriptor.mapping?.close?.();
  }
} finally {
  owner.descriptor.mapping?.close?.();
  primitives.unlinkSharedMemory?.(name);
}
```

Use `"create"` on the owner and `"open"` on the peer. **The name is the capability** — anyone who knows it can map the memory — so generate a hard-to-guess name, keep it private, and `unlinkSharedMemory` it when you're done.
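One way to get a hard-to-guess name is to append random bytes from the Web Crypto API to a short prefix. The sketch below assumes the global `crypto` object is available (it is on recent Node, Deno, and Bun releases). `makeChannelName`, the `"knit"` prefix, and the 12-byte length are illustrative choices, not part of Knitting.

```typescript
// Build a channel name from a prefix plus 96 bits of randomness.
function makeChannelName(prefix = "knit"): string {
  const bytes = crypto.getRandomValues(new Uint8Array(12));
  const hex = Array.from(bytes, (byte) => byte.toString(16).padStart(2, "0")).join("");
  return `${prefix}-${hex}`;
}

const channelName = makeChannelName();
const anotherName = makeChannelName();
```

The name is kept short on purpose, since some platforms limit how long a shared-memory name can be. Pass the result as `name` on both the `"create"` and `"open"` sides, and share it only with the peer that needs it.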
## Sending one to a container

Docker process workers can receive a `ProcessSharedBuffer`, but it must be **named** — the default anonymous form is fd-backed and private to the parent/child path, which a container can't reopen. Create the payload with `mode: "create"` and a name, run the pool with `processSharedMemory: "named"`, and add `--ipc=host` so the container shares the namespace. See [Process workers](/guides/process-workers/) for the pool side.

## Cleaning up

Shared memory is not garbage-collected for you:

- Close every mapping you open with `descriptor.mapping?.close?.()`.
- For named channels, the owner also calls `primitives.unlinkSharedMemory?.(name)` once nobody needs the name anymore.

This is a same-host, fast path — both sides map the same bytes — not a network transport. Anonymous is the safe default; reach for named only when processes can't inherit a handle. For thread-only zero-copy transfers within the same process, see [Buffer reference](/guides/buffer-reference/).

---

# Buffer reference

URL: https://knittingdocs.netlify.app/guides/buffer-reference/

knitting/unsafe — zero-copy ArrayBuffer handles for thread workers.

`BufferReference` is a zero-copy handle for moving `ArrayBuffer` bytes to **thread** workers. Import it from the subpath:

```ts
import { BufferReference } from "knitting/unsafe";
```

`knitting/unsafe` is experimental, so the API may still change. Used with its defaults it is memory-safe on every runtime — the only mode that needs care is the `borrow` return ([see below](#return-copy-vs-borrow)), and even that is safe on Node.

---

## Move semantics

Constructing a `BufferReference` **detaches the source immediately**. The bytes now belong to the reference; reading or writing the original buffer after construction returns zero-length or stale data.
```ts
const pixels = new Uint8Array([0, 64, 128, 192, 255]);
const ref = new BufferReference(pixels);

// pixels.buffer is now detached
console.log(pixels.byteLength); // 0 — the source was moved
console.log(ref.byteLength);    // 5
```

This is intentional: the move is what makes the transfer zero-copy.

---

## Sending to a worker

Use `BufferReference` as a task argument to move bytes to a thread worker without serializing them through the transport:

```ts
import { createPool, isMain, task } from "knitting";
import { BufferReference } from "knitting/unsafe";

export const invert = task<BufferReference, BufferReference>({
  f: (ref) => {
    const pixels = ref.toUint8Array(); // the moved bytes, no copy
    const out = new Uint8Array(pixels.length);
    for (let i = 0; i < pixels.length; i++) out[i] = 255 - pixels[i];
    return new BufferReference(out); // move the result back
  },
});

if (isMain) {
  const pixels = new Uint8Array([0, 64, 128, 192, 255]);
  using pool = createPool({ threads: 1 })({ invert });

  const result = await pool.call.invert(new BufferReference(pixels));
  console.log([...result.toUint8Array()]); // [255, 191, 127, 63, 0]
}
```

---

## Reading the bytes

Two accessors materialize the moved region:

| Method | Returns | Notes |
|--------|---------|-------|
| `toUint8Array()` | `Uint8Array` | View over the moved bytes. |
| `toArrayBuffer()` | `ArrayBuffer` | The underlying buffer. |

These views are backed by the reference, so use them while it is alive and let `using` / `release()` clean up when you are done. In the default copy mode this is ordinary lifecycle hygiene, not a safety hazard — the stricter rules only apply to the `borrow` return mode below.

---

## Disposing

`BufferReference` implements `Symbol.dispose`.
Use `using` to release automatically, or call `release()` explicitly:

```ts
{
  using result = await pool.call.invert(new BufferReference(pixels));
  const out = result.toUint8Array();
  console.log([...out]);
} // result is released here
```

After `release()` the reference no longer owns the bytes, so stop using any view you took from it. In the default copy mode this is just normal cleanup; the only mode where reading too late is an actual use-after-free is `borrow` on Deno and Bun (see below).

---

## Constraints

- **Thread workers only.** The handle is a process-local pointer. Sending a `BufferReference` to a process worker throws. For cross-process sharing use `ProcessSharedBuffer` (see [Shared memory](/guides/shared-memory/)).
- **`ArrayBuffer` sources only.** `SharedArrayBuffer` cannot be detached and is rejected. SAB-backed typed-array views are also rejected.
- **One-shot.** Each reference is used once. The worker materializes the bytes with `toUint8Array()` / `toArrayBuffer()`, and those values are borrowed for the duration of the call. Do not use them from fire-and-forget work after the task returns.

---

## Return copy vs borrow

When a worker returns a `BufferReference`, the host must take possession of the bytes. The default is **copy** (safe on all runtimes):

```ts
using pool = createPool({
  threads: 1,
  unsafe: { BufferReferenceReturn: "copy" },
})({ invert });
```

On Node.js the return is always zero-copy regardless of this setting — the engine co-owns the backing store across threads, so there is no use-after-free risk even with `"borrow"`. On Deno and Bun, `"copy"` takes a single copy of the returned bytes so they safely outlive the worker.
Set it to `"borrow"` to skip that copy:

```ts
import { BufferReferenceReturn } from "knitting/unsafe";

using pool = createPool({
  threads: 1,
  unsafe: { BufferReferenceReturn: BufferReferenceReturn.Borrow },
})({ invert });

{
  using result = await pool.call.invert(new BufferReference(pixels));
  const out = result.toUint8Array(); // borrowed — valid only while result lives
  console.log([...out]);
} // result is released here; do not read out after this point
```

`"borrow"` only carries risk on **Deno and Bun**: the host reads directly into the worker's memory, so you must call `release()` (or let `using` do it) **before** the producing worker shuts down, and if the bytes escape into HTTP responses, streams, timers, or caches you must copy them first. On Node there is no such hazard — the return is zero-copy and safe either way.

The named constants `BufferReferenceReturn.Copy` (`"copy"`) and `BufferReferenceReturn.Borrow` (`"borrow"`) are exported from `knitting/unsafe`.

---

## As an `Envelope` body

`BufferReference` can be the body of an `Envelope`, combining a JSON header with zero-copy bytes:

```ts
import { Envelope, task } from "knitting";
import { BufferReference } from "knitting/unsafe";

export const process = task<
  Envelope<{ op: string }, BufferReference>,
  Envelope<{ done: boolean }, BufferReference>
>({
  f: (env) => {
    const pixels = env.payload.toUint8Array();
    const out = new Uint8Array(pixels.length);
    for (let i = 0; i < pixels.length; i++) out[i] = 255 - pixels[i];
    return new Envelope({ done: true }, new BufferReference(out));
  },
});
```

Disposing the envelope also disposes a `BufferReference` body. Disposing is a no-op for `ArrayBuffer` and `SharedArrayBuffer` bodies. See [Payloads — Envelope](/guides/payloads#envelope) for the full body type table.

---

## When to reach for it

Below roughly 256 KiB the per-call pointer setup tends to cost more than just copying through the shared transport.
Reach for `BufferReference` only when the copy cost of a large buffer to a thread worker actually matters in profiling. For process workers, use `ProcessSharedBuffer` instead. For smaller buffers, a plain `ArrayBuffer` or typed array is simpler and works for both thread and process workers.

---

# Performance

URL: https://knittingdocs.netlify.app/guides/performance/

Tips about performance and good practices.

## Quick legend

Bear in mind that even Slow here can still be **2-4x faster** than `postMessage` (depending on payload and workload).

- Best: near "header-only" cost, best path
- Fast: still very cheap per call
- Good: fine for real workloads
- Fair: watch frequency
- Slow: avoid in hot loops, consider alternatives

### Rating thresholds (per call, easy to tweak later)

These tiers are intentionally simple. If you re-run benchmarks on a new CPU/runtime, edit these numbers and re-label rows as needed.

- Best: < 1 us
- Fast: < 2 us
- Good: < 4 us
- Fair: < 9 us
- Slow: >= 9 us

---

## Benchmark environment

- clk: ~3.86 GHz
- cpu: Apple M3 Ultra
- runtime: node 24.12.0 (arm64-darwin)

---

## Mental model (what makes things fast)

**1) Header-only / "no payload"**
Values fit into the call header: tiny encode/decode cost.

**2) Static payload**
Reuses part of the header plus a small buffer, only for small payloads.

**3) Dynamic payload (allocator path)**
Needs allocation, copying, and bookkeeping. Still fast, but you'll feel it in hot loops.

**4) Pointer teleportation (zero-copy)**
`SharedArrayBuffer` and `ProcessSharedBuffer` are never copied — only a pointer/handle crosses the boundary and both sides map the same bytes. Size stops mattering: a 1 KiB and a 64 MiB `SharedArrayBuffer` cost the same to pass. `BufferReference` (`knitting/unsafe`) is the thread-only *move* variant: it detaches the source and hands the bytes over without a copy.
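The rating thresholds listed earlier on this page reduce to a small lookup, which can be handy when re-labelling rows after a new benchmark run. The sketch below is a hypothetical helper, not part of Knitting. `tierFor` and `TIER_LIMITS` are illustrative names, and treating a value of exactly 9 us as Slow is an assumption.

```typescript
type Tier = "Best" | "Fast" | "Good" | "Fair" | "Slow";

// Exclusive upper bounds in microseconds per call, in ascending order.
const TIER_LIMITS: ReadonlyArray<readonly [Tier, number]> = [
  ["Best", 1],
  ["Fast", 2],
  ["Good", 4],
  ["Fair", 9],
];

// Return the first tier whose upper bound the measurement stays under.
function tierFor(microsPerCall: number): Tier {
  for (const [tier, limit] of TIER_LIMITS) {
    if (microsPerCall < limit) return tier;
  }
  return "Slow";
}

const labels = [0.4, 1, 3.9, 8, 9, 25].map(tierFor);
```

Editing the numbers in `TIER_LIMITS` is all it takes to re-tier results for a different CPU or runtime.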
---

## Single call categories (1 value)

| Case | Tier |
| --- | --- |
| Primitives: `boolean`, `undefined`, `null` | Best |
| Numbers: `number` | Best |
| Time/IDs: `Date` | Best |
| Strings: small `string` | Best |
| Symbols: `Symbol.for` | Fast |
| BigInt: small `bigint` | Best |
| BigInt: large `bigint` | Fast |
| Binary: typed arrays | Best |
| Views: `DataView` | Good |
| Structured: JSON object | Good |
| Structured: JSON array | Good |
| Errors: `Error` | Slow |

---

## Configuration effects on performance

### Thread count

More threads means more lanes for the balancer to distribute work across, but each thread has a fixed memory cost (payload buffers, shared lock regions, abort signal pool). Returns diminish past the number of physical cores. For compute-only workloads, `threads: os.availableParallelism() - 1` is a reasonable starting point.

### Inliner

The inliner skips encode/decode entirely, so header-only payloads on the inline lane are effectively free. For tiny math tasks, adding `inliner: { position: "last", batchSize: 64 }` can improve throughput noticeably. See [Inliner guide](/guides/inliner).

### Permissions

Enabling `permission: "strict"` adds startup cost per worker (flag generation, lock file resolution) but has no measurable per-call overhead once workers are running. The cost is one-time and small.

### Payload sizing

`payload.payloadInitialBytes`, `payload.payloadMaxByteLength`, and `payload.maxPayloadBytes` control the shared buffer each worker allocates. Larger initial buffers avoid runtime growth at the cost of upfront memory. If your payloads are consistently small (primitives, short strings), the defaults (`4 MiB` initial, `64 MiB` max length, `8 MiB` hard dynamic cap) are usually enough.
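As a starting point for the thread count and payload sizing knobs above, the sketch below builds an options object without importing Knitting. The option names come from this page; pairing each quoted default with each option name follows the order they are listed in and is an assumption, as are the two invariants checked at the end.

```ts
import { availableParallelism } from "node:os";

const MiB = 1024 * 1024;

// Leave one core for the host, but never drop below a single worker.
const threads = Math.max(1, availableParallelism() - 1);

// Values mirror the defaults quoted above (assumed mapping, see lead-in).
const payload = {
  payloadInitialBytes: 4 * MiB,
  payloadMaxByteLength: 64 * MiB,
  maxPayloadBytes: 8 * MiB,
};

// Assumed invariants: the initial size and the dynamic cap both fit under the max length.
const sizingLooksSane =
  payload.payloadInitialBytes <= payload.payloadMaxByteLength &&
  payload.maxPayloadBytes <= payload.payloadMaxByteLength;

const poolOptions = { threads, payload };
console.log(poolOptions, { sizingLooksSane });
```

An object shaped like `poolOptions` would then be passed to `createPool(...)`; raising `payloadInitialBytes` trades upfront memory for fewer runtime growth steps.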
--- ## Threads vs processes The two worker runtimes isolate memory differently, so they expose different zero-copy tools: | Runtime | Isolation | Zero-copy tools | | --- | --- | --- | | `thread` (default) | shares the host address space | `SharedArrayBuffer`, `BufferReference` (move) | | `process` | separate memory and permissions | `ProcessSharedBuffer` (OS shared memory) | A `SharedArrayBuffer` or `BufferReference` cannot cross a process boundary — reach for `ProcessSharedBuffer` when the worker is a process. See [Shared memory](/guides/shared-memory/) and [Buffer reference](/guides/buffer-reference/). Because a process worker is just a child process, you can spawn it behind a command prefix (`worker.processCommandPrefix`) so another tool launches it — a sandbox like `bwrap` or a container like Docker: ```ts worker: { runtime: "process", processCommandPrefix: ["bwrap", "--unshare-all", "--ro-bind", "/", "/"], } ``` See [Process workers](/guides/process-workers/) for the full wrapper recipes. > Caution: For anything security-sensitive, define the task with > [`importTask`](/guides/defining-tasks/#importing-worker-side-code-with-importtask). > The host holds only a typed wrapper and **never imports or evaluates the > module** — that code runs only inside the worker, under the worker's > permissions, never at host scope. --- ## HTTP request handling `call.*()` accepts `Promise` inputs, not just already-resolved values. In an HTTP handler this lets you forward the request body promise straight into a task without stopping the request thread to materialize it first: ```ts app.post("/jwt", async (c) => { const responseJson = await handlers.call.issueJwt(c.req.arrayBuffer()); return c.body(responseJson ?? "Bad request", responseJson ? 
200 : 400, { "content-type": "application/json; charset=utf-8", }); }); ``` The promise isn't faster on its own — the win is that the request thread never stops to materialize the body before handing it to Knitting: - `c.req.arrayBuffer()` already returns a promise, so forwarding it skips an `await` in the handler. - UTF-8 decode / JSON parsing happens in the worker, not on the request thread. - `ArrayBuffer` stays on the binary fast path. ### Head + body with `Envelope` When you need both the request metadata (the head) and the raw body, wrap them in an `Envelope`: the header carries the parsed metadata, the payload carries the body bytes. Shape it from the body promise with `.then(...)` so you still never await the body on the request thread: ```ts import { Envelope } from "knitting"; app.post("/upload", async (c) => { const result = await handlers.call.storeUpload( c.req.arrayBuffer().then( (body) => new Envelope( { contentType: c.req.header("content-type") ?? "application/octet-stream" }, body, ), ), ); return c.json(result); }); ``` For a large binary body sent to a **thread** worker, make the body a `BufferReference` instead of an `ArrayBuffer` to move the bytes with no copy — only the body line changes: ```ts import { BufferReference } from "knitting/unsafe"; new Envelope( { contentType: c.req.header("content-type") ?? "application/octet-stream" }, new BufferReference(body), // moves the body bytes, zero-copy (thread workers) ); ``` This is most useful when the route is basically a transport layer and the worker owns parsing anyway, like SSR or JWT issuance. If you need to inspect the body on the main thread before dispatch, await it locally and validate there. ### Choosing a return type The return path has the same costs as the input path, in reverse. 
There are many ways to return a result; pick by how big it is and what it already is: - **JSON object / array** — serialized on the worker and parsed again on the host, so it pays a *double* pass over the data (stringify + parse). Fine for small results, heavy for large ones. - **`SharedArrayBuffer` / `ProcessSharedBuffer`** — pointer teleportation; the cheapest way to hand bytes back when you can back the result with shared memory. `ProcessSharedBuffer` also works from process workers. - **`BufferReference`** — for big binaries you cannot easily cast into a `SharedArrayBuffer` (for example a buffer produced by a library you don't control). Zero-copy on Node; a single copy on Deno and Bun. Thread workers only. See [Payloads](/guides/payloads/) and [Buffer reference](/guides/buffer-reference/) for the full type list. --- # Inliner URL: https://knittingdocs.netlify.app/guides/inliner/ Run pure compute on the host thread as an extra lane. The `inliner` option adds the host (main thread) as one extra lane in the pool. Instead of sending everything to workers, the host participates in execution too -- useful when every task is pure computation and you want to squeeze one more core out of the machine. ```ts const pool = createPool({ threads: 4, inliner: { position: "last", batchSize: 16 }, })({ add }); ``` > Note: The inliner is a lane, not a replacement for worker threads. > Use it to complement workers, not to avoid them. ## When to use it The inliner shines for **math and pure-compute workloads** that run on the host without touching the network or filesystem: - Number crunching, scoring, hashing, matrix ops. - Batch transforms over arrays of primitives. - Short, synchronous functions where IPC overhead matters more than the work itself. - Bursty queues where one extra lane helps drain work faster. Because inline tasks skip worker IPC entirely (no encode/decode round-trip), they can be significantly faster for tiny payloads. 
## When NOT to use it - **HTTP / networking** -- inline tasks run on the main thread, so any I/O blocks the event loop and defeats the isolation that workers provide. If you need request handling, keep it in workers. - **File system or database calls** -- same problem. Anything that awaits external I/O will stall timers, sockets, and other pools sharing the host. - **Long-running async work** -- the inliner is designed around fast, ideally synchronous functions. Async tasks that take tens of milliseconds or more will block the batch loop and starve other inline slots. - **Isolation-sensitive code** -- if a task can throw or corrupt shared state, run it in a worker where a crash stays contained. > Warning: Inline tasks run on the main thread. Anything that blocks -- network calls, disk > reads, heavy async chains -- will freeze the event loop. Stick to pure math > and transforms. ## How it runs Inline execution is not immediate in the `call.*()` path. Calls are queued, then processed when the macro-queue turn runs. This delay is intentional: by that point, the dispatcher has had a chance to send/receive worker tasks first, then the host drains inline work. `batchSize` controls how many inline tasks run per macro-queue turn. For compute workloads, higher values let the host churn through more work per tick without yielding back to the event loop unnecessarily. ## Options ```ts createPool({ threads: number, inliner: { position?: "first" | "last", batchSize?: number, dispatchThreshold?: number, }, balancer?: "roundRobin" | "robinRound" | "firstIdle" | "randomLane" | "firstIdleOrRandom", }) ``` ### position Controls where the inline lane sits relative to worker lanes. - `"first"` -- host lane is considered before workers. Good when inline work is cheaper than IPC and you want the host to grab tasks first. - `"last"` -- host lane is considered after workers. Workers get priority; the host only picks up overflow. For most compute pools, `"last"` is the safe default. 
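The effect of `position` is easiest to see with a first-idle pick. The following is a simplified model, not Knitting's scheduler: `Lane`, `orderLanes`, and `firstIdle` are hypothetical names used only to show how lane order changes which lane gets a task.

```ts
type Lane = { id: string; busy: boolean };
type Position = "first" | "last";

// Order lanes the way `position` describes: host before or after the workers.
function orderLanes(workers: Lane[], host: Lane, position: Position): Lane[] {
  return position === "first" ? [host, ...workers] : [...workers, host];
}

// A first-idle pick: scan in order and take the first lane that is not busy.
function firstIdle(lanes: Lane[]): Lane | undefined {
  return lanes.find((lane) => !lane.busy);
}

const host: Lane = { id: "host", busy: false };
const workers: Lane[] = [
  { id: "w0", busy: false },
  { id: "w1", busy: false },
];

const pickFirst = firstIdle(orderLanes(workers, host, "first"))?.id;
const pickLast = firstIdle(orderLanes(workers, host, "last"))?.id;

// With every worker busy, "last" lets the host take the overflow.
const allBusy = workers.map((w) => ({ ...w, busy: true }));
const pickOverflow = firstIdle(orderLanes(allBusy, host, "last"))?.id;

console.log({ pickFirst, pickLast, pickOverflow });
// prints { pickFirst: 'host', pickLast: 'w0', pickOverflow: 'host' }
```

In this model `"last"` keeps the host idle until the workers are saturated, which matches the "host only picks up overflow" description.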
### batchSize How many inline tasks are processed per macro-queue turn. - Higher values = better throughput for pure math (fewer yields to the event loop). - Lower values = more responsive host (other timers and callbacks get a chance to run between batches). Defaults to `1` when the inliner is enabled. For compute-heavy pools you typically want a much higher value (16, 64, 128+). ### dispatchThreshold Minimum in-flight calls before the inline lane becomes eligible for scheduling. - `1` (default) -- inline lane is immediately eligible. - Higher values -- host lane stays excluded until concurrency rises past the threshold, then joins to help drain the burst. This is a pressure-relief valve: at low concurrency, workers handle everything; once a burst builds up, the host pitches in. ## Exact behavior - The scheduler tracks `inFlight` calls per task invoker. - On each call, `inFlight` increments before lane selection. - If `inFlight < dispatchThreshold`, scheduling uses worker-only lanes (inline lane excluded). - If `inFlight >= dispatchThreshold`, scheduling uses all lanes (workers + inline lane). - `inFlight` decrements on resolve, reject, or synchronous throw. - The configured balancer strategy applies to whichever lane set is currently active. ## Internals The inline executor uses typed arrays (`Int32Array`, `Int8Array`) to manage execution slots and a `RingQueue` for pending work. It coordinates with the event loop through a `MessageChannel` (macro-task boundary) and `queueMicrotask` (micro-task fast path). The first dispatch in a burst resolves in microtasks; overflow beyond `batchSize` defers to the next macro-task turn. Promise arguments are awaited before execution (unlike thenables, which are passed through as-is). Timeout specs from `task()` are applied via a `Promise.race` wrapper only when the task returns a `Promise`. 
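The timeout rule can be pictured with a small standalone helper. This is a sketch of the general `Promise.race` pattern under the assumption stated above (wrap only when the result is a `Promise`); `withTimeout` and its error message are hypothetical and are not Knitting's internal code.

```ts
// Hypothetical helper: add a timeout only when the task result is a Promise.
function withTimeout(result: unknown, ms: number): unknown {
  // Synchronous value: return it untouched, no wrapper and no timer.
  if (!(result instanceof Promise)) return result;

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });

  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([result, timeout]).finally(() => clearTimeout(timer));
}

const syncResult = withTimeout(21 * 2, 50);
const asyncResult = withTimeout(Promise.resolve("done"), 50);

const syncIsPlainValue = syncResult === 42;
const asyncIsPromise = asyncResult instanceof Promise;
console.log({ syncIsPlainValue, asyncIsPromise });
```

A synchronous inline task therefore pays nothing for a configured timeout in this model, which fits the inliner's focus on short synchronous functions.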
Abort signals on inline tasks use a static toolkit where `hasAborted()` always returns `false` -- inline tasks cannot be individually aborted since they share the host thread. ## Balancer guidance - `roundRobin` -- simple rotation across all lanes. Works well when tasks are uniform. - `robinRound` -- legacy alias of `roundRobin`. - `firstIdle` -- picks the first idle lane. Prioritizes workers when `position: "last"`. - `firstIdleOrRandom` or `randomLane` -- useful for pools with many registered tasks or uneven load. ## Examples ### Math pipeline High batch size, host joins after workers: ```ts using pool = createPool({ threads: 4, inliner: { position: "last", batchSize: 64 }, balancer: "firstIdleOrRandom", })({ scoreChunk }); const results = await Promise.all( chunks.map((chunk) => pool.call.scoreChunk(chunk)), ); ``` ### Burst drain with threshold Host stays out until concurrency spikes: ```ts using pool = createPool({ threads: 2, inliner: { position: "last", batchSize: 32, dispatchThreshold: 16, }, balancer: "roundRobin", })({ hash }); ``` ### Single-thread + inliner Useful when you want one worker for isolation but the host can handle the easy math too: ```ts using pool = createPool({ threads: 1, inliner: { position: "first", batchSize: 8 }, balancer: "roundRobin", })({ add }); ``` --- # Examples URL: https://knittingdocs.netlify.app/examples/intro_examples/ Practical, copyable patterns for using Knitting in real workloads. These examples are real patterns you can copy into your project -- not toy snippets. Most pages also point to an optional host-vs-worker benchmark script, but the example code is the main thing to copy. The Hono server page is the one that keeps the fuller performance story. ## Pick the right example for your use case **"I have a web server and want to keep it responsive under load"** Start with [Hono server routes](/examples/data_transforms/rendering_output/hono_server). 
It's the most realistic example -- a real HTTP server with SSR, JWT, and health check routes. If your server handles file uploads, [Image processing API](/examples/data_transforms/rendering_output/image_processing) shows the same pattern with binary payloads (resize, thumbnail, watermark via sharp). **"I do SSR and want to offload rendering"** [React SSR](/examples/data_transforms/rendering_output/react_ssr) shows the basic pattern. [React SSR compression](/examples/data_transforms/rendering_output/react_ssr_compress) adds Brotli and tests where compression should live (worker vs host). **"I need to validate or parse lots of data"** [Schema validation](/examples/data_transforms/validation/schema_validate) (Zod), [JWT revalidation](/examples/data_transforms/validation/jwt_revalidation) (Web Crypto), or [Salt hashing](/examples/data_transforms/validation/salt_hashing) (PBKDF2) -- pick whichever is closest to your workload. **"I'm building an LLM-powered app"** [Prompt token budgeting](/examples/data_transforms/validation/prompt_token_budgeting) trims prompts to fit a token budget before they hit the API. **"I have a CPU-heavy computation I want to parallelize"** The math examples cover the spectrum: [Monte Carlo pi](/examples/maths/monte_pi) (embarrassingly parallel), [Physics loop](/examples/maths/physics_loop) (variable-work simulation), [Big prime](/examples/maths/big_prime) (long-running search), and [TSP](/examples/maths/tsp_gsa) (NP-hard optimization with parallel restarts). **"I just want to convert some text"** [Markdown to HTML](/examples/data_transforms/rendering_output/markdown_to_html) is the simplest rendering example -- string in, compressed string out. 
## All examples ### Math and simulation - [Big prime](/examples/maths/big_prime) -- long-running BigInt search with Miller-Rabin - [Monte Carlo pi](/examples/maths/monte_pi) -- independent sampling and reduction - [Physics loop](/examples/maths/physics_loop) -- branch-heavy simulation with variable work per trial - [TSP (GSA)](/examples/maths/tsp_gsa) -- parallel heuristic restarts for NP-hard optimization ### Data transforms - [Data transforms overview](/examples/data_transforms/intro_data_transforms) -- grouped by validation and rendering - [Schema validation](/examples/data_transforms/validation/schema_validate) -- Zod validation on workers - [JWT revalidation](/examples/data_transforms/validation/jwt_revalidation) -- HMAC verify + renewal with Web Crypto - [Salt hashing](/examples/data_transforms/validation/salt_hashing) -- PBKDF2 password hashing (heaviest per-call workload) - [Prompt token budgeting](/examples/data_transforms/validation/prompt_token_budgeting) -- trim LLM prompts to a token budget - [React SSR](/examples/data_transforms/rendering_output/react_ssr) -- render React components on workers - [React SSR compression](/examples/data_transforms/rendering_output/react_ssr_compress) -- SSR + Brotli, worker vs host comparison - [Hono server routes](/examples/data_transforms/rendering_output/hono_server) -- real HTTP server with offloaded routes - [Image processing API](/examples/data_transforms/rendering_output/image_processing) -- resize, thumbnail, and watermark with sharp on workers - [Markdown to HTML](/examples/data_transforms/rendering_output/markdown_to_html) -- simple markdown transform pipeline ## Patterns worth copying 1. **Keep worker tasks focused and deterministic.** One task, one job, predictable output. 2. **Batch calls and await them together** (`Promise.all`) to reduce scheduling overhead. 3. **Return compact summaries from workers** instead of large raw outputs. 4. 
**Validate on the host.** Recompute critical metrics when needed -- don't trust worker output blindly. 5. **Compare against a host-only baseline** before claiming speedups. 6. **Tune chunk sizes** based on throughput -- there's always a sweet spot between dispatch overhead and load balance. --- # Data transforms URL: https://knittingdocs.netlify.app/examples/data_transforms/intro_data_transforms/ Validation, rendering, and output examples -- pick the one closest to your workload. These examples use real payload-carrying workloads on purpose. Primitive-only tasks are often near-instant, which hides coordination and transfer costs. Payload transforms are closer to production traffic, so the host-vs-worker comparisons are more honest. Treat the benchmark blocks in this section as supporting material, not the main event. Most pages are now organized around the runnable example first, with a benchmark command attached if you want to measure the same pattern on your machine. If you want one page that keeps the full performance discussion, start with [Hono server routes](/examples/data_transforms/rendering_output/hono_server). ## Which example should you start with? **If you're validating incoming data** -- API payloads, form submissions, webhook bodies -- start with [Schema validation](/examples/data_transforms/validation/schema_validate). It's the simplest validation pattern and uses Zod, which you're probably already familiar with. **If you're doing auth work** -- token verification, renewal, signature checking -- [JWT revalidation](/examples/data_transforms/validation/jwt_revalidation) shows how to offload HMAC crypto to workers using built-in Web Crypto (no external JWT library). **If you're hashing passwords** -- [Salt hashing](/examples/data_transforms/validation/salt_hashing) is the heaviest per-call example. PBKDF2 is intentionally slow, making it the ideal candidate for offloading. 
**If you're calling an LLM** -- [Prompt token budgeting](/examples/data_transforms/validation/prompt_token_budgeting) trims prompts to fit a token budget. The budgeting itself is model-agnostic. **If you're rendering HTML** -- [React SSR](/examples/data_transforms/rendering_output/react_ssr) for basic server rendering, [React SSR compression](/examples/data_transforms/rendering_output/react_ssr_compress) to also test Brotli placement, or [Markdown to HTML](/examples/data_transforms/rendering_output/markdown_to_html) for a simpler pipeline without React. **If you handle file uploads** -- [Image processing API](/examples/data_transforms/rendering_output/image_processing) offloads resize, thumbnail, and watermark operations to workers using sharp. It's the binary payload counterpart to the Hono SSR example. **If you want the full picture** -- [Hono server routes](/examples/data_transforms/rendering_output/hono_server) is a real HTTP server that combines SSR, JWT, and health checks. It's the closest to a production setup. Parse incoming strings, validate shape and types, return typed output. [Schema validation](/examples/data_transforms/validation/schema_validate), [JWT revalidation](/examples/data_transforms/validation/jwt_revalidation), [Salt hashing](/examples/data_transforms/validation/salt_hashing), [Prompt token budgeting](/examples/data_transforms/validation/prompt_token_budgeting). Take validated input and produce final output formats. [Hono server routes](/examples/data_transforms/rendering_output/hono_server), [Image processing API](/examples/data_transforms/rendering_output/image_processing), [React SSR](/examples/data_transforms/rendering_output/react_ssr), [React SSR compression](/examples/data_transforms/rendering_output/react_ssr_compress), [Markdown to HTML](/examples/data_transforms/rendering_output/markdown_to_html). ## What all these examples have in common - Each includes a host-only baseline so performance comparisons are apples-to-apples. 
- Worker tasks return compact results (counts, compressed buffers, status flags) -- not large raw payloads. - The same code runs in both host and worker mode, so you can compare behavior without changing logic. --- # Big prime URL: https://knittingdocs.netlify.app/examples/maths/big_prime/ Parallel prime search through large integers with Miller-Rabin Scans through large integers (default: 1500-bit) looking for probable primes using Miller-Rabin, distributed across Knitting workers. This is a long-running compute pipeline -- it scans windows of candidates, prints progress, and keeps going until you stop it. ## How it works 1. The host picks a starting odd integer in the chosen bit-width. 2. It scans **windows** of 10,000,000 odd candidates, printing progress after each window. 3. Each window is split across threads -- threads scan disjoint candidate sequences (no overlap, no duplicated work). 4. Each worker runs `candidate -> Miller-Rabin -> next candidate -> ...` in a tight loop. 5. If any worker finds a probable prime, it reports it for that window. 6. The process runs continuously until Ctrl+C. The scanning uses an interleaved stride pattern: thread 0 checks `start + 0`, `start + 2T`, `start + 4T`, ..., thread 1 checks `start + 2`, `start + 2 + 2T`, ..., and so on. This guarantees full coverage with zero overlap. ## Run Expected output: ``` starting at 1500-bit odd integer (4 threads, 8 MR rounds) window 1: scanned 10,000,000 candidates in 4.2s -- no prime found window 2: scanned 10,000,000 candidates in 4.1s -- no prime found window 3: scanned 10,000,000 candidates in 4.3s -- OK probable prime found! digits: 452 hex prefix: 0xA3F7... ``` > Note: At 1500 bits, primes are sparse -- expect to scan through many windows. Lower `--bits` to 65 or 128 for faster results while testing. 
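The interleaved stride described under "How it works" can be checked with a small standalone sketch. It uses plain numbers instead of the BigInt values the real example works with, and `candidatesForThread` is a hypothetical helper written only for this illustration.

```ts
// Candidates for one thread: start + 2*t, then steps of 2*T (odd numbers only).
function candidatesForThread(
  start: number,
  thread: number,
  threads: number,
  count: number,
): number[] {
  const step = 2 * threads;
  const out: number[] = [];
  let x = start + 2 * thread;
  for (let i = 0; i < count; i++) {
    out.push(x);
    x += step;
  }
  return out;
}

const start = 101; // any odd starting point
const T = 4; // number of threads
const perThread = 3; // candidates each thread scans

const all: number[] = [];
for (let t = 0; t < T; t++) {
  all.push(...candidatesForThread(start, t, T, perThread));
}

// Every odd number from start up to start + 2*T*perThread appears exactly once.
const sorted = [...all].sort((a, b) => a - b);
const noOverlap = new Set(all).size === all.length;
const fullCoverage = sorted.every((v, i) => v === start + 2 * i);

console.log({ noOverlap, fullCoverage, thread1: candidatesForThread(start, 1, T, perThread) });
// prints { noOverlap: true, fullCoverage: true, thread1: [ 103, 111, 119 ] }
```

Each thread's offset (`2 * t`) and the shared step (`2 * T`) correspond to the `offsetNum` and `stepNum` values the host passes to the worker task in the code below.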
## Code ```ts import { createPool, isMain } from "knitting"; import { scanForProbablePrime } from "./prime_scan.ts"; function intArg(name: string, fallback: number) { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const v = Number(process.argv[i + 1]); if (Number.isFinite(v) && v > 0) return Math.floor(v); } return fallback; } const THREADS = intArg("threads", 4); const BITS = intArg("bits", 1500); const WINDOW = intArg("window", 10_000_000); const CHUNK = intArg("chunk", 500_000); const ROUNDS = intArg("rounds", 10); function xorshift32(s: number): number { s |= 0; s ^= s << 13; s ^= s >>> 17; s ^= s << 5; return s | 0; } function makeRandomOdd(bits: number, seed: number): bigint { // Build a BigInt from 3x 32-bit chunks, mask to bits, set top bit, make odd. let s = seed | 0; let x = 0n; for (let k = 0; k < 3; k++) { s = xorshift32(s); x = (x << 32n) | BigInt(s >>> 0); } const mask = (1n << BigInt(bits)) - 1n; x &= mask; x |= 1n << BigInt(bits - 1); x |= 1n; return x; } const seedBase = (Date.now() | 0) ^ 0x9e3779b9; let windowStartOdd = makeRandomOdd(BITS, seedBase); const { call, shutdown } = createPool({ threads: THREADS, balancer: "firstIdle", })({ scanForProbablePrime }); let stopping = false; process.on("SIGINT", () => { if (stopping) return; stopping = true; console.log("\nCtrl+C received. Shutting down..."); shutdown(); process.exit(0); }); function splitCounts(total: number, parts: number): number[] { const base = Math.floor(total / parts); const rem = total % parts; const out = new Array(parts); for (let i = 0; i < parts; i++) out[i] = base + (i < rem ? 1 : 0); return out; } async function scanOneWindow(): Promise< { hit: string | null; tested: number } > { // We interleave odds across threads: thread i tests start+2i, start+2i+2T, ... const stepNum = 2 * THREADS; // Divide WINDOW across threads, and within each thread further divide into CHUNK-sized tasks. 
const perThread = splitCounts(WINDOW, THREADS); let bestHit: string | null = null; let tested = 0; // For each thread, we run sequential “subtasks” so each thread covers its share of WINDOW. // But all threads run in parallel each wave. const subTasksPerThread = perThread.map((c) => Math.ceil(c / CHUNK)); const maxSubs = Math.max(...subTasksPerThread); for (let sub = 0; sub < maxSubs; sub++) { const jobs: Promise<[number, string, number]>[] = []; for (let t = 0; t < THREADS; t++) { const threadTotal = perThread[t]; const startAt = sub * CHUNK; if (startAt >= threadTotal) continue; const count = Math.min(CHUNK, threadTotal - startAt); // offset in "odd steps": 2*t + 2*THREADS*startAt const offsetNum = 2 * t + stepNum * startAt; jobs.push( call.scanForProbablePrime([ windowStartOdd.toString(), count, stepNum, offsetNum, ROUNDS, ]), ); tested += count; } const results = await Promise.all(jobs); // If any job found a hit, keep the smallest hit (nice for consistency) for (const [found, primeStr] of results) { if (found) { if (bestHit === null) bestHit = primeStr; else { // compare as BigInt safely const a = BigInt(bestHit); const b = BigInt(primeStr); if (b < a) bestHit = primeStr; } } } } return { hit: bestHit, tested }; } async function main() { console.log("Prime hunt (probable primes via Miller–Rabin)"); console.log( "threads:", THREADS, "bits:", BITS, "window:", WINDOW.toLocaleString(), "chunk:", CHUNK.toLocaleString(), "rounds:", ROUNDS, ); console.log("start :", windowStartOdd.toString()); console.log("mode : infinite windows (Ctrl+C to stop)"); let windowsDone = 0; let totalTested = 0n; while (true) { const { hit, tested } = await scanOneWindow(); windowsDone++; totalTested += BigInt(tested); if (hit) { console.log( `[window ${windowsDone}] +${tested.toLocaleString()} tested (total ${totalTested.toString()}) | HIT: ${hit}`, ); } else { console.log( `[window ${windowsDone}] +${tested.toLocaleString()} tested (total ${totalTested.toString()}) | no hit (Ctrl+C to 
stop)`, ); } // Move start forward by WINDOW odd candidates (i.e., +2*WINDOW) windowStartOdd += 2n * BigInt(WINDOW); } } if (isMain) { main().finally(shutdown); } ``` ```ts import { task } from "knitting"; /** * Payload-safe: * - args: strings + numbers only * - return: numbers + strings only * This avoids any accidental BigInt/number mixing at the transport boundary. */ // args: [startOddStr, count, stepNum, offsetNum, rounds] export const scanForProbablePrime = task< [string, number, number, number, number], // ret: [found(0/1), primeStrOrEmpty, tested] [number, string, number] >({ f: ([startOddStr, count, stepNum, offsetNum, rounds]) => { // Convert once, keep everything BigInt inside. let x = BigInt(startOddStr) + BigInt(offsetNum); if ((x & 1n) === 0n) x += 1n; const step = BigInt(stepNum); const rds = rounds | 0; for (let i = 0; i < count; i++) { if (isProbablePrime(x, rds)) return [1, x.toString(), i + 1]; x += step; } return [0, "", count]; }, }); function modPow(base: bigint, exp: bigint, mod: bigint): bigint { let r = 1n; let b = base % mod; let e = exp; // must be bigint while (e > 0n) { if ((e & 1n) === 1n) r = (r * b) % mod; e >>= 1n; if (e) b = (b * b) % mod; } return r; } const small = [3n, 5n, 7n, 11n, 13n, 17n, 19n, 23n, 29n, 31n, 37n]; const bases = [2n, 325n, 9375n, 28178n, 450775n, 9780504n, 1795265022n]; function isProbablePrime(n: bigint, rounds: number): boolean { if (n < 2n) return false; if (n === 2n || n === 3n) return true; if ((n & 1n) === 0n) return false; // quick small-prime filter for (const p of small) { if (n === p) return true; if (n % p === 0n) return false; } // n-1 = d * 2^s let d = n - 1n; let s = 0; while ((d & 1n) === 0n) { d >>= 1n; s++; } // good practical bases (still "probable prime" for 65-bit+) const rds = rounds | 0; for (let i = 0; i < rds; i++) { const a = (bases[i % bases.length] % (n - 3n)) + 2n; // [2, n-2] let x = modPow(a, d, n); if (x === 1n || x === n - 1n) continue; let composite = true; for (let r = 1; r < 
s; r++) { x = (x * x) % n; if (x === n - 1n) { composite = false; break; } } if (composite) return false; } return true; }
```

## The math behind it

**Why primes are findable:** Around a number of size N, prime density is roughly `1/ln(N)`. Primes get rarer as numbers get bigger, but not impossibly rare -- even at 1500 bits, you'll find them.

**Probable vs proven primes:** Miller-Rabin is a probabilistic test. With `k` rounds on randomly chosen bases, the worst-case false-positive rate is at most `4^-k`: about `1.5e-5` for 8 rounds and about `9.5e-7` for the script's default of 10. The example cycles through a fixed list of bases rather than random ones, so treat that bound as a guide, not a guarantee. Cryptographic libraries use the same test for key generation, typically with more rounds or random bases.

**Tuning `--rounds`:** More rounds = higher confidence, but more compute per candidate. 8 rounds is a solid baseline (the script defaults to 10). You can go higher if you need cryptographic-grade confidence.

## CLI knobs

- `--bits` -- bit-width of candidates (higher = harder, sparser primes)
- `--threads` -- worker count
- `--window` -- candidates per window (controls progress granularity)
- `--chunk` -- candidates per task (controls scheduling granularity)
- `--rounds` -- Miller-Rabin rounds (controls confidence vs speed)

## Things to try

1. Lower `--bits` to 65 and watch primes appear frequently.
2. Increase `--rounds` to 20 and measure the throughput impact.
3. Compare different `--chunk` values -- too small and dispatch overhead dominates, too large and load balance suffers.
4. Modify the task to search for twin primes (p and p+2 both prime) or Sophie Germain primes (p and 2p+1 both prime).

---

# React SSR

URL: https://knittingdocs.netlify.app/examples/data_transforms/rendering_output/react_ssr/

Render React components to HTML strings on workers -- the example closest to a real server workload.

Renders React components to HTML strings on workers using `renderToString`. If your server does SSR, this is probably the example closest to your real workload -- parse JSON input, normalize data, render a component, return HTML.

## How it works

The host generates JSON payload strings.
The host path calls `renderUserCardHost` directly (parse + normalize + `renderToString`). The worker path sends the same payloads through `createPool`. Byte totals are compared once to verify the host and worker produce identical output, then `mitata` benchmarks both paths. Three files: - `bench_react_ssr.ts` -- the benchmark itself - `render_user_card.tsx` -- the SSR component and worker task - `utils.ts` -- input payloads and normalization helpers ## Example payload and result Input: ```json { "id": "u42", "name": "Ari Lane", "handle": "@ari", "bio": "Building fast UIs.", "plan": "pro", "location": "Austin, TX", "joinedAt": "2026-01-18", "tags": ["react", "ssr", "workers"], "stats": { "posts": 42, "followers": 1200, "following": 180, "likes": 9800 }, "alerts": { "unread": 3, "lastLogin": "2026-01-18" } } ``` Minimal usage: ```ts const pool = createPool({ threads: 2 })({ renderUserCard }); const html = await pool.call.renderUserCard(payloadJson); ``` Result: ```html

``` ## Deno setup (TSX + npm) If you run this with Deno and see `Uncaught SyntaxError: Unexpected token '<'`, set a root `deno.json` so Deno transpiles TSX and resolves npm packages: ```json { "nodeModulesDir": "auto", "compilerOptions": { "jsx": "react-jsx", "jsxImportSource": "react" } } ``` ## Optional benchmark Expected output: ``` byte parity check: host=284,160 worker=284,160 OK match benchmark avg (ns) min ... max (ns) host 18,200 16,800 ... 24,500 knitting 9,400 8,600 ... 14,100 ``` > Note: The byte parity check runs once before benchmarking to confirm host and worker produce identical HTML. If it doesn't match, something is wrong with your setup. ## Code ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { renderUserCard, renderUserCardHost } from "./render_user_card.tsx"; import { buildUserPayloads } from "./utils.ts"; const THREADS = 1; const REQUESTS = 2_000; async function main() { const payloads = buildUserPayloads(REQUESTS); const pool = createPool({ threads: THREADS, inliner: { batchSize: 6, }, })({ renderUserCard }); let sink = 0; try { const hostBytes = runHost(payloads); const knittingBytes = await runWorkers(pool.call.renderUserCard, payloads); if (hostBytes !== knittingBytes) { throw new Error("Host and worker HTML byte totals differ."); } console.log("React SSR benchmark (mitata)"); console.log("workload: parse + normalize + render to HTML"); console.log("requests per iteration:", REQUESTS.toLocaleString()); console.log("threads:", THREADS, " + inliner"); boxplot(() => { summary(() => { bench(`host (${REQUESTS.toLocaleString()} req)`, () => { sink = runHost(payloads); }); bench( `knitting (${THREADS} thread(s) + main , ${REQUESTS.toLocaleString()} req)`, async () => { sink = await runWorkers(pool.call.renderUserCard, payloads); }, ); }); }); await run(); console.log("last html bytes:", sink.toLocaleString()); } finally { pool.shutdown(); } } function runHost(payloads: string[]): 
number { let htmlBytes = 0; for (let i = 0; i < payloads.length; i++) { const html = renderUserCardHost(payloads[i]!); htmlBytes += html.length; } return htmlBytes; } async function runWorkers( callRender: (payload: string) => Promise<string>, payloads: string[], ): Promise<number> { const jobs: Promise<string>[] = []; for (let i = 0; i < payloads.length; i++) { jobs.push(callRender(payloads[i]!)); } const results = await Promise.all(jobs); let htmlBytes = 0; for (let i = 0; i < results.length; i++) { htmlBytes += results[i]!.length; } return htmlBytes; } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```tsx import React from "react"; import { renderToString } from "react-dom/server"; import { task } from "knitting"; import { clamp, engagementScore, formatJoinDate, initials, levelForScore, type NormalizedUser, normalizeUser, } from "./utils.ts"; function Stat({ label, value }: { label: string; value: number }) { return (

{value.toLocaleString()} {label}

); } function Badge({ plan }: { plan: "free" | "pro" }) { const text = plan === "pro" ? "PRO" : "FREE"; return {text}; } function UserCard({ user }: { user: NormalizedUser }) { const score = engagementScore(user.stats); const level = levelForScore(score); const joined = formatJoinDate(user.joinedAt); const profileCompleteness = (user.bio ? 30 : 0) + (user.location ? 20 : 0) + (user.tags.length ? 20 : 0) + (user.handle ? 10 : 0) + (user.stats.posts ? 20 : 0); const completeness = clamp(profileCompleteness, 10, 100); const topTags = user.tags.slice(0, 6); const achievementBadges = [ user.stats.followers >= 1000 ? "1k+ followers" : "", user.stats.likes >= 5000 ? "5k+ likes" : "", user.stats.posts >= 50 ? "50+ posts" : "", ].filter(Boolean); return (

{user.name}

{user.handle}

📍 {user.location} • Joined {joined}

Level: {level} Score {score.toLocaleString()}

{user.alerts.unread} unread Last login {user.alerts.lastLogin}

Bio

{user.bio ?

{user.bio}

No bio yet.

}

Profile completeness

{completeness}% complete

Highlights

{badge}
Getting started

Interests

{tag}
No tags

); } export function renderUserCardHost(payloadJson: string): string { const parsed = JSON.parse(payloadJson) as unknown; const user = normalizeUser(parsed); const html = renderToString(<UserCard user={user} />); return html; } export const renderUserCard = task({ f: renderUserCardHost, }); ``` ```ts export type UserStats = { posts: number; followers: number; following: number; likes: number; }; export type UserAlerts = { unread: number; lastLogin: string; }; export type UserPayload = { id?: string; name?: string; handle?: string; bio?: string; plan?: "free" | "pro"; location?: string; joinedAt?: string; tags?: string[]; stats?: Partial<UserStats>; alerts?: Partial<UserAlerts>; }; export type NormalizedUser = Required<UserPayload> & { stats: UserStats; alerts: UserAlerts; }; const TAGS = [ "react", "ssr", "typescript", "performance", "parallel", "workers", "ui", "web", ]; const LOCATIONS = ["Austin, TX", "Seattle, WA", "Brooklyn, NY", "Denver, CO"]; function pickFrom<T>(arr: T[], index: number): T { return arr[index % arr.length]!; } function toNumber(value: unknown, fallback: number): number { return Number.isFinite(value) ? Number(value) : fallback; } function toStringArray(value: unknown): string[] { if (!Array.isArray(value)) return []; return value.filter((item): item is string => typeof item === "string"); } export function makeUserPayloadJson(i: number): string { const short = i.toString(36); return JSON.stringify({ id: `u${short}`, name: `User ${short.toUpperCase()}`, handle: `@${short}`, bio: `Building fast UIs. Coffee + TypeScript. (${short})`, plan: i % 7 === 0 ? 
"pro" : "free", location: pickFrom(LOCATIONS, i), joinedAt: `202${(i % 4) + 2}-0${(i % 8) + 1}-1${i % 9}`, tags: [ pickFrom(TAGS, i), pickFrom(TAGS, i + 1), pickFrom(TAGS, i + 2), pickFrom(TAGS, i + 3), ], stats: { posts: (i % 120) + 1, followers: (i * 13) % 50_000, following: (i * 7) % 5_000, likes: (i * 31) % 250_000, }, alerts: { unread: i % 25, lastLogin: `2026-0${(i % 8) + 1}-0${(i % 9) + 1}`, }, }); } export function buildUserPayloads(count: number): string[] { const payloads = new Array<string>(count); for (let i = 0; i < count; i++) payloads[i] = makeUserPayloadJson(i); return payloads; } export function normalizeUser(payload: unknown): NormalizedUser { const obj = (payload ?? {}) as Record<string, unknown>; const id = typeof obj.id === "string" ? obj.id : "unknown"; const name = typeof obj.name === "string" ? obj.name : "Anonymous"; const handle = typeof obj.handle === "string" ? obj.handle : `@${id}`; const bio = typeof obj.bio === "string" ? obj.bio : ""; const plan = obj.plan === "pro" ? "pro" : "free"; const location = typeof obj.location === "string" ? obj.location : "Unknown"; const joinedAt = typeof obj.joinedAt === "string" ? obj.joinedAt : "2024-05-01"; const statsRaw = (obj.stats ?? {}) as Record<string, unknown>; const alertsRaw = (obj.alerts ?? {}) as Record<string, unknown>; const stats: UserStats = { posts: toNumber(statsRaw.posts, 0), followers: toNumber(statsRaw.followers, 0), following: toNumber(statsRaw.following, 0), likes: toNumber(statsRaw.likes, 0), }; const alerts: UserAlerts = { unread: toNumber(alertsRaw.unread, 0), lastLogin: typeof alertsRaw.lastLogin === "string" ? 
alertsRaw.lastLogin : "2026-01-18", }; const tags = toStringArray(obj.tags); return { id, name, handle, bio, plan, location, joinedAt, tags, stats, alerts, }; } export function formatJoinDate(value: string): string { const date = new Date(value); if (Number.isNaN(date.getTime())) return value; return date.toLocaleDateString("en-US", { month: "short", year: "numeric" }); } export function initials(name: string): string { const parts = name.trim().split(/\s+/); const first = parts[0]?.[0] ?? "U"; const second = parts.length > 1 ? parts[1]?.[0] ?? "" : ""; return (first + second).toUpperCase(); } export function clamp(value: number, min: number, max: number): number { return Math.min(max, Math.max(min, value)); } export function engagementScore(stats: UserStats): number { return Math.round( stats.posts * 2 + stats.likes * 0.05 + stats.followers * 0.4 + stats.following * 0.1, ); } export function levelForScore(score: number): string { if (score >= 5000) return "Legend"; if (score >= 2500) return "Elite"; if (score >= 1000) return "Rising"; if (score >= 300) return "Active"; return "New"; } ``` ## Why SSR is a great fit for workers `renderToString` is synchronous and CPU-bound -- it walks your component tree and builds an HTML string. On a request-per-request basis it's not usually slow, but under load it blocks the event loop and everything else waits. Moving SSR to a worker pool means your main thread stays free to accept requests and handle I/O while rendering happens in parallel. The Hono server example shows what this looks like in a real HTTP server context. --- # Schema validate URL: https://knittingdocs.netlify.app/examples/data_transforms/validation/schema_validate/ Validate JSON payloads against a Zod schema on workers. Parses JSON strings, validates them against a Zod schema, and returns typed results -- either on the main thread or through a Knitting worker pool. 
If your app already does `JSON.parse` + validation and you want to see what offloading looks like, start here. ## How it works The host generates JSON strings (some valid, some intentionally broken). Each job parses the string, runs it through `UserSchema.safeParse`, and returns `{ ok: true, value }` or `{ ok: false, issues }`. The host aggregates counts and prints sample failures. Three files: - `schema_knitting.ts` -- runs parse+validate in host and Knitting modes - `utils.ts` -- schema logic, payload builders, task exports - `bench_schema_validate.ts` -- host-vs-worker benchmark with `mitata` ## Example payloads ```json // valid { "id": "u_42", "email": "ari@knitting.dev", "displayName": "Ari Lane", "age": 29, "roles": ["admin"], "marketingOptIn": true } // invalid { "id": "u_42", "email": "ari@knitting.dev", "displayName": "x", "age": "unknown", "roles": ["owner"] } ``` The first returns `{ ok: true, value }`. The second returns `{ ok: false, issues }`, with messages like `displayName: String must contain at least 2 character(s)` or `age: Expected number, received string`. ## Run You should see output like: ``` -- host mode -- valid: 800 invalid: 200 sample issue: ["Expected string, received number"] -- knitting mode (2 threads) -- valid: 800 invalid: 200 sample issue: ["Expected string, received number"] ``` ## Optional benchmark The benchmark compares `JSON.parse + safeParse` via direct function imports (`host`) against the same logic dispatched through a worker pool (`knitting`). Batch calls keep per-dispatch overhead predictable. Expected output: ``` benchmark avg (ns) min ... max (ns) host 12,340 11,200 ... 18,400 knitting 6,890 6,100 ... 11,200 ``` > Note: Exact numbers depend on your hardware and batch size. The shape of the result -- workers winning on batched validation -- is what matters. 
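
To make the batching remark concrete, here is a small sketch of how batch size determines the number of pool dispatches. It uses the benchmark's own constants (20,000 requests, batch size 64); `dispatchPlan` and `DispatchPlan` are hypothetical names for illustration and are not part of Knitting or of the example files.

```ts
// Hypothetical helper (not a Knitting API): how many task calls a batched run issues.
type DispatchPlan = { dispatches: number; lastBatchSize: number };

function dispatchPlan(requests: number, batchSize: number): DispatchPlan {
  // Same clamping idea as the example's makeBatches: batch size is at least 1.
  const size = Math.max(1, Math.floor(batchSize));
  const total = Math.max(0, Math.floor(requests));
  const dispatches = Math.ceil(total / size);
  const remainder = total % size;
  // The final batch is short unless the total divides evenly.
  const lastBatchSize = total === 0 ? 0 : remainder === 0 ? size : remainder;
  return { dispatches, lastBatchSize };
}

const plan = dispatchPlan(20_000, 64);
console.log(plan); // { dispatches: 313, lastBatchSize: 32 }
```

With one call per payload the pool would see 20,000 dispatches; batching by 64 brings that down to 313, so the fixed cost of each dispatch is spread over many validations.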
## Code ```ts import { createPool, isMain } from "knitting"; import { buildPayloads, parseAndValidate, parseAndValidateHost, type ParseValidateResult, } from "./utils.ts"; const THREADS = 2; const REQUESTS = 20_000; const INVALID_PERCENT = 15; type Summary = { valid: number; invalid: number; sampleIssues: string[]; }; function summarize(results: ParseValidateResult[]): Summary { let valid = 0; let invalid = 0; const sampleIssues: string[] = []; for (let i = 0; i < results.length; i++) { const result = results[i]!; if (result.ok) { valid++; continue; } invalid++; if (sampleIssues.length < 3 && result.issues.length > 0) { sampleIssues.push(result.issues[0]!); } } return { valid, invalid, sampleIssues }; } function runHost(payloads: string[]): Summary { const results = payloads.map((payload) => parseAndValidateHost(payload)); return summarize(results); } async function runWorkers(payloads: string[]): Promise<Summary> { const pool = createPool({ threads: THREADS })({ parseAndValidate }); try { const jobs: Promise<ParseValidateResult>[] = []; for (let i = 0; i < payloads.length; i++) { jobs.push(pool.call.parseAndValidate(payloads[i]!)); } const results = await Promise.all(jobs); return summarize(results); } finally { pool.shutdown(); } } function printSummary(mode: string, summary: Summary, ms: number): void { const secs = Math.max(1e-9, ms / 1000); const rps = REQUESTS / secs; console.log(mode); console.log("requests :", REQUESTS.toLocaleString()); console.log("invalidRate :", `${INVALID_PERCENT}%`); console.log("valid :", summary.valid.toLocaleString()); console.log("invalid :", summary.invalid.toLocaleString()); console.log("took :", `${ms.toFixed(2)} ms`); console.log("throughput :", `${rps.toFixed(0)} req/s`); if (summary.sampleIssues.length > 0) { console.log("sampleIssues:", summary.sampleIssues.join(" | ")); } } async function main() { const payloads = buildPayloads(REQUESTS, INVALID_PERCENT); const hostStart = performance.now(); const hostSummary = runHost(payloads); const hostMs = performance.now() - hostStart; const workerStart = performance.now(); const workerSummary = await runWorkers(payloads); const workerMs = performance.now() - workerStart; const uplift = (hostMs / Math.max(1e-9, workerMs) - 1) * 100; console.log("JSON parse + schema validation"); console.log(`threads: ${THREADS}`); console.log(""); printSummary("host", hostSummary, hostMs); console.log(""); printSummary("knitting", workerSummary, workerMs); console.log(""); console.log(`uplift: ${uplift.toFixed(1)}%`); } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { buildPayloads, makeBatches, mergeValidationSummary, parseAndValidateBatchFast, parseAndValidateBatchFastHost, sameValidationSummary, type ValidationSummary, } from "./utils.ts"; const THREADS = 2; const 
REQUESTS = 20_000; const INVALID_PERCENT = 15; const BATCH = 64; async function main() { const payloads = buildPayloads(REQUESTS, INVALID_PERCENT); const payloadBatches = makeBatches(payloads, BATCH); const pool = createPool({ threads: THREADS, })({ parseAndValidateBatchFast }); let sink = 0; try { const hostCheck = runHostBatches(payloadBatches); const workerCheck = await runWorkerBatches( pool.call.parseAndValidateBatchFast, payloadBatches, ); if (!sameValidationSummary(hostCheck, workerCheck)) { throw new Error("Host and worker validation counts differ."); } console.log("Schema validation benchmark (mitata)"); console.log("workload: JSON.parse + UserSchema.safeParse"); console.log("requests per iteration:", REQUESTS.toLocaleString()); console.log("invalid rate:", `${INVALID_PERCENT}%`); console.log("batch size:", BATCH); console.log("threads:", THREADS); boxplot(() => { summary(() => { bench(`host (${REQUESTS.toLocaleString()} req, batch ${BATCH})`, () => { const totals = runHostBatches(payloadBatches); sink = totals.valid; }); bench( `knitting (${THREADS} thread${ THREADS === 1 ? 
"" : "s" }, ${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runWorkerBatches( pool.call.parseAndValidateBatchFast, payloadBatches, ); sink = totals.valid; }, ); }); }); await run(); console.log("last valid count:", sink.toLocaleString()); } finally { pool.shutdown(); } } function runHostBatches(payloadBatches: string[][]): ValidationSummary { let totals: ValidationSummary = { valid: 0, invalid: 0 }; for (let i = 0; i < payloadBatches.length; i++) { totals = mergeValidationSummary( totals, parseAndValidateBatchFastHost(payloadBatches[i]!), ); } return totals; } async function runWorkerBatches( callBatch: (payloads: string[]) => Promise<ValidationSummary>, payloadBatches: string[][], ): Promise<ValidationSummary> { const jobs: Promise<ValidationSummary>[] = []; for (let i = 0; i < payloadBatches.length; i++) { jobs.push(callBatch(payloadBatches[i]!)); } const results = await Promise.all(jobs); let totals: ValidationSummary = { valid: 0, invalid: 0 }; for (let i = 0; i < results.length; i++) { totals = mergeValidationSummary(totals, results[i]!); } return totals; } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { task } from "knitting"; import { z } from "zod"; const UserSchema = z.object({ id: z.string().min(1), email: z.string().email(), displayName: z.string().min(2).max(80), age: z.number().int().min(13).max(120), roles: z.array(z.enum(["user", "admin", "moderator"])).default(["user"]), marketingOptIn: z.boolean().default(false), }); export type User = z.infer<typeof UserSchema>; export type ParseValidateResult = | { ok: true; value: User } | { ok: false; issues: string[] }; export type ValidationSummary = { valid: number; invalid: number; }; export function makeValidPayload(i: number): string { const short = i.toString(36); const role = i % 9 === 0 ? 
"admin" : "user"; return JSON.stringify({ id: `u_${short}`, email: `${short}@knitting.dev`, displayName: `User ${short.toUpperCase()}`, age: 18 + (i % 60), roles: [role], marketingOptIn: i % 2 === 0, }); } export function makePayload(i: number, invalidPercent: number): string { if (i % 100 >= invalidPercent) return makeValidPayload(i); switch (i % 4) { case 0: return '{"id":"broken"'; case 1: return JSON.stringify({ id: `u_${i}`, displayName: `User ${i}`, age: 33, roles: ["user"], marketingOptIn: true, }); case 2: return JSON.stringify({ id: `u_${i}`, email: `u_${i}@knitting.dev`, displayName: "x", age: "unknown", roles: ["user"], }); default: return JSON.stringify({ id: `u_${i}`, email: `u_${i}@knitting.dev`, displayName: `User ${i}`, age: 31, roles: ["owner"], }); } } export function buildPayloads(count: number, invalidPercent: number): string[] { const cappedInvalid = Math.max(0, Math.min(95, Math.floor(invalidPercent))); const size = Math.max(0, Math.floor(count)); const payloads = new Array<string>(size); for (let i = 0; i < size; i++) payloads[i] = makePayload(i, cappedInvalid); return payloads; } export function makeBatches<T>(values: T[], batchSize: number): T[][] { const size = Math.max(1, Math.floor(batchSize)); const batches: T[][] = []; for (let i = 0; i < values.length; i += size) { batches.push(values.slice(i, i + size)); } return batches; } export function mergeValidationSummary( a: ValidationSummary, b: ValidationSummary, ): ValidationSummary { return { valid: a.valid + b.valid, invalid: a.invalid + b.invalid, }; } export function sameValidationSummary( a: ValidationSummary, b: ValidationSummary, ): boolean { return a.valid === b.valid && a.invalid === b.invalid; } function toIssues(error: z.ZodError): string[] { return error.issues.map((issue) => { const path = issue.path.length > 0 ? 
issue.path.join(".") : "payload"; return `${path}: ${issue.message}`; }); } export function parseAndValidateHost(rawPayload: string): ParseValidateResult { let parsed: unknown; try { parsed = JSON.parse(rawPayload) as unknown; } catch { return { ok: false, issues: ["payload: invalid JSON string"] }; } const result = UserSchema.safeParse(parsed); if (!result.success) { return { ok: false, issues: toIssues(result.error) }; } return { ok: true, value: result.data }; } export const parseAndValidate = task({ f: parseAndValidateHost, }); export function parseAndValidateFastHost(rawPayload: string): boolean { let parsed: unknown; try { parsed = JSON.parse(rawPayload) as unknown; } catch { return false; } return UserSchema.safeParse(parsed).success; } export function parseAndValidateBatchFastHost( rawPayloads: string[], ): ValidationSummary { let valid = 0; let invalid = 0; for (let i = 0; i < rawPayloads.length; i++) { if (parseAndValidateFastHost(rawPayloads[i]!)) { valid++; } else { invalid++; } } return { valid, invalid }; } export const parseAndValidateBatchFast = task({ f: parseAndValidateBatchFastHost, }); ``` ## When to use this pattern Schema validation is a textbook case for worker offloading: each call is independent, the input/output is small, and Zod's internals are CPU-bound (type checking, error formatting). If you're validating hundreds of payloads per second -- API gateway, webhook ingestion, form processing -- batching them through a pool can free your main thread without changing any validation logic. --- # JWT revalidation URL: https://knittingdocs.netlify.app/examples/data_transforms/validation/jwt_revalidation/ Verify JWTs and optionally reissue them using Web Crypto -- no external JWT library needed. Verifies JWT signatures with HMAC SHA-256, checks expiry, and optionally reissues tokens -- all using built-in Web Crypto, no external JWT package. This example shows what it looks like to offload auth-related crypto work to a worker pool. 
## How it works The host builds a batch of JWTs (mix of valid and expired). Each job verifies the signature, checks the renewal window (`renewWindowSec`, `renewUntil`), and either returns the validated claims or issues a fresh token. The result comes back as a stringified JSON response -- keeping structured-clone overhead low. Three files: - `jwt_knitting.ts` -- runs revalidation in host and Knitting modes - `utils.ts` -- token verification, renewal logic, task exports - `bench_jwt_revalidation.ts` -- host-vs-worker benchmark with `mitata` ## Example request and response Input to the task is a JSON string like this: ```json { "token": "", "nowSec": 1767225600, "ttlSec": 180, "renewWindowSec": 30 } ``` The task returns a JSON string. A successful non-renewed response looks like: ```json { "ok": true, "renewed": false, "token": "", "sub": "user_42", "sid": "session_42", "exp": 1767225750, "canRenew": true } ``` If the token is near expiry but still renewable, the same shape comes back with `renewed: true` and a replacement `token`. Uses built-in Web Crypto APIs -- no extra JWT package required. ## Run Expected output: ``` -- host mode -- verified: 900 renewed: 80 rejected: 20 -- knitting mode (2 threads) -- verified: 900 renewed: 80 rejected: 20 ``` ## Optional benchmark Compares verify + optional renewal + `JSON.stringify` via direct imports (`host`) against the same workload through worker task calls (`knitting`). Batched dispatch keeps the comparison stable. > Note: Both paths use the same logic. The only variable is whether execution happens on the main thread or in a worker. 
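
The renewal window described under "How it works" can be sketched as a pure decision function. This is a minimal sketch that mirrors the policy in `utils.ts` (`shouldRenewToken` plus the expiry check) with signature verification left out; `RenewalClaims`, `Decision`, and `decideRevalidation` are hypothetical names, and the `renewUntil` value is an assumption chosen for illustration.

```ts
// Sketch only: renewal policy without signature checks.
// `RenewalClaims`, `Decision`, and `decideRevalidation` are hypothetical names.
type RenewalClaims = { exp: number; renewUntil: number };
type Decision = "accept" | "renew" | "reject";

function decideRevalidation(
  claims: RenewalClaims,
  nowSec: number,
  renewWindowSec: number,
): Decision {
  // Renew once we are inside the window before `exp`, as long as the
  // hard renewal deadline has not been reached yet.
  const insideWindow = nowSec >= claims.exp - renewWindowSec;
  if (insideWindow && nowSec < claims.renewUntil) return "renew";
  // Not renewable: an expired token is rejected, a live one is accepted as-is.
  return nowSec > claims.exp ? "reject" : "accept";
}

// `exp` and the first `nowSec` come from the example request and response;
// `renewUntil` is an assumed value.
const claims: RenewalClaims = { exp: 1767225750, renewUntil: 1767226500 };

console.log(decideRevalidation(claims, 1767225600, 30)); // "accept": 150 s before exp
console.log(decideRevalidation(claims, 1767225730, 30)); // "renew": 20 s before exp
console.log(decideRevalidation(claims, 1767225800, 30)); // "renew": expired, before renewUntil
console.log(decideRevalidation(claims, 1767226600, 30)); // "reject": past renewUntil
```

The first call matches the example response above (`renewed: false`): with `renewWindowSec` at 30, renewal only starts 30 seconds before `exp`.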
## Code ```ts import { createPool, isMain } from "knitting"; import { buildDemoRevalidateRequests, type RenewalSummary, revalidateToken, revalidateTokenHost, summarizeJsonResponses, } from "./utils.ts"; const THREADS = 2; const REQUESTS = 25_000; const INVALID_PERCENT = 10; async function runHost(rawRequests: string[]): Promise<RenewalSummary> { const outputs = new Array<string>(rawRequests.length); for (let i = 0; i < rawRequests.length; i++) { outputs[i] = await revalidateTokenHost(rawRequests[i]!); } return summarizeJsonResponses(outputs); } async function runWorkers(rawRequests: string[]): Promise<RenewalSummary> { const pool = createPool({ threads: THREADS })({ revalidateToken }); try { const jobs: Promise<string>[] = []; for (let i = 0; i < rawRequests.length; i++) { jobs.push(pool.call.revalidateToken(rawRequests[i]!)); } const outputs = await Promise.all(jobs); return summarizeJsonResponses(outputs); } finally { pool.shutdown(); } } function printSummary(mode: string, totals: RenewalSummary, ms: number): void { const seconds = Math.max(1e-9, ms / 1000); const rps = REQUESTS / seconds; console.log(mode); console.log("requests :", REQUESTS.toLocaleString()); console.log("invalid rate :", `${INVALID_PERCENT}%`); console.log("accepted :", totals.ok.toLocaleString()); console.log("renewed :", totals.renewed.toLocaleString()); console.log("rejected :", totals.rejected.toLocaleString()); console.log("output bytes :", totals.outputBytes.toLocaleString()); console.log("took :", `${ms.toFixed(2)} ms`); console.log("throughput :", `${rps.toFixed(0)} req/s`); } async function main() { const rawRequests = await buildDemoRevalidateRequests({ count: REQUESTS, invalidPercent: INVALID_PERCENT, }); const hostStart = performance.now(); const hostTotals = await runHost(rawRequests); const hostMs = performance.now() - hostStart; const workerStart = performance.now(); const workerTotals = await runWorkers(rawRequests); const workerMs = performance.now() - workerStart; const uplift = (hostMs / Math.max(1e-9, workerMs) - 1) * 
100; console.log("JWT token revalidation"); console.log(`threads: ${THREADS}`); console.log(""); printSummary("host", hostTotals, hostMs); console.log(""); printSummary("knitting", workerTotals, workerMs); console.log(""); console.log(`uplift: ${uplift.toFixed(1)}%`); } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { buildDemoRevalidateRequests, makeBatches, mergeRenewalSummary, type RenewalSummary, revalidateTokenBatchFast, revalidateTokenBatchFastHost, sameRenewalSummary, } from "./utils.ts"; const THREADS = 2; const REQUESTS = 25_000; const INVALID_PERCENT = 10; const BATCH = 64; async function runHostBatches(rawBatches: string[][]): Promise<RenewalSummary> { let totals: RenewalSummary = { ok: 0, renewed: 0, rejected: 0, outputBytes: 0, }; for (let i = 0; i < rawBatches.length; i++) { totals = mergeRenewalSummary( totals, await revalidateTokenBatchFastHost(rawBatches[i]!), ); } return totals; } async function runWorkerBatches( callBatch: (rawRequests: string[]) => Promise<RenewalSummary>, rawBatches: string[][], ): Promise<RenewalSummary> { const jobs: Promise<RenewalSummary>[] = []; for (let i = 0; i < rawBatches.length; i++) { jobs.push(callBatch(rawBatches[i]!)); } const results = await Promise.all(jobs); let totals: RenewalSummary = { ok: 0, renewed: 0, rejected: 0, outputBytes: 0, }; for (let i = 0; i < results.length; i++) { totals = mergeRenewalSummary(totals, results[i]!); } return totals; } async function main() { const rawRequests = await buildDemoRevalidateRequests({ count: REQUESTS, invalidPercent: INVALID_PERCENT, }); const rawBatches = makeBatches(rawRequests, BATCH); const pool = createPool({ threads: THREADS })({ revalidateTokenBatchFast }); let sink = 0; try { const hostCheck = await runHostBatches(rawBatches); const workerCheck = await runWorkerBatches( pool.call.revalidateTokenBatchFast, rawBatches, ); if (!sameRenewalSummary(hostCheck, 
workerCheck)) { throw new Error("Host and worker JWT summaries differ."); } console.log("JWT revalidation benchmark (mitata)"); console.log( "workload: verify token -> renew when allowed -> JSON.stringify", ); console.log("requests per iteration:", REQUESTS.toLocaleString()); console.log("invalid rate:", `${INVALID_PERCENT}%`); console.log("batch size:", BATCH); console.log("threads:", THREADS); boxplot(() => { summary(() => { bench( `host (${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runHostBatches(rawBatches); sink = totals.outputBytes; }, ); bench( `knitting (${THREADS} thread${ THREADS === 1 ? "" : "s" }, ${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runWorkerBatches( pool.call.revalidateTokenBatchFast, rawBatches, ); sink = totals.outputBytes; }, ); }); }); await run(); console.log("last output bytes:", sink.toLocaleString()); } finally { pool.shutdown(); } } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { task } from "knitting"; export const DEMO_JWT_SECRET = "knitting-docs-demo-hs256-secret"; const DEFAULT_TTL_SEC = 180; const DEFAULT_RENEW_WINDOW_SEC = 30; const MIN_TTL_SEC = 10; const MAX_TTL_SEC = 86_400; const MAX_RENEW_WINDOW_SEC = 3_600; const encoder = new TextEncoder(); const decoder = new TextDecoder(); const keyCache = new Map<string, Promise<CryptoKey>>(); type JwtHeader = { alg: "HS256"; typ: "JWT"; }; export type JwtClaims = { sub: string; sid: string; scope: string[]; iat: number; exp: number; renewUntil: number; }; export type RevalidateRequest = { token: string; nowSec?: number; ttlSec?: number; renewWindowSec?: number; }; export type RevalidateResponse = | { ok: true; renewed: boolean; token: string; sub: string; sid: string; exp: number; canRenew: boolean; } | { ok: false; renewed: false; reason: string; }; export type RenewalSummary = { ok: number; renewed: number; rejected: number; outputBytes: number; }; export type 
DemoRequestOptions = { count: number; nowSec?: number; invalidPercent?: number; ttlSec?: number; renewWindowSec?: number; }; export function makeBatches<T>(values: T[], batchSize: number): T[][] { const size = Math.max(1, Math.floor(batchSize)); const batches: T[][] = []; for (let i = 0; i < values.length; i += size) { batches.push(values.slice(i, i + size)); } return batches; } export function mergeRenewalSummary( a: RenewalSummary, b: RenewalSummary, ): RenewalSummary { return { ok: a.ok + b.ok, renewed: a.renewed + b.renewed, rejected: a.rejected + b.rejected, outputBytes: a.outputBytes + b.outputBytes, }; } export function sameRenewalSummary( a: RenewalSummary, b: RenewalSummary, ): boolean { return a.ok === b.ok && a.renewed === b.renewed && a.rejected === b.rejected && a.outputBytes === b.outputBytes; } function clampInt( value: unknown, fallback: number, min: number, max: number, ): number { const numberValue = Number(value); if (!Number.isFinite(numberValue)) return fallback; const integer = Math.floor(numberValue); if (integer < min) return min; if (integer > max) return max; return integer; } function isRecord(value: unknown): value is Record<string, unknown> { return typeof value === "object" && value !== null && !Array.isArray(value); } function bytesToBase64Url(bytes: Uint8Array): string { let binary = ""; for (let i = 0; i < bytes.length; i++) { binary += String.fromCharCode(bytes[i]!); } const base64 = btoa(binary); return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/g, ""); } function base64UrlToBytes(base64url: string): Uint8Array { const normalized = base64url.replace(/-/g, "+").replace(/_/g, "/"); const pad = (4 - (normalized.length % 4)) % 4; const padded = normalized + "=".repeat(pad); const binary = atob(padded); const out = new Uint8Array(binary.length); for (let i = 0; i < binary.length; i++) { out[i] = binary.charCodeAt(i); } return out; } function safeParseJson(raw: string): unknown | null { try { return JSON.parse(raw) as unknown; } catch { 
return null; } } function fixedTimeEqual(a: Uint8Array, b: Uint8Array): boolean { if (a.length !== b.length) return false; let diff = 0; for (let i = 0; i < a.length; i++) { diff |= a[i]! ^ b[i]!; } return diff === 0; } function parseClaims(value: unknown): JwtClaims | null { if (!isRecord(value)) return null; const { sub, sid, scope, iat, exp, renewUntil } = value; if (typeof sub !== "string" || sub.length === 0) return null; if (typeof sid !== "string" || sid.length === 0) return null; if (!Array.isArray(scope) || scope.length === 0) return null; if (!scope.every((item) => typeof item === "string" && item.length > 0)) { return null; } if ( !Number.isInteger(iat) || !Number.isInteger(exp) || !Number.isInteger(renewUntil) ) { return null; } if (exp <= iat) return null; if (renewUntil < exp) return null; return { sub, sid, scope: [...scope], iat, exp, renewUntil }; } async function getHmacKey(secret: string): Promise<CryptoKey> { let keyPromise = keyCache.get(secret); if (!keyPromise) { keyPromise = crypto.subtle.importKey( "raw", encoder.encode(secret), { name: "HMAC", hash: "SHA-256" }, false, ["sign"], ); keyCache.set(secret, keyPromise); } return keyPromise; } async function signInput( signingInput: string, secret: string, ): Promise<string> { const key = await getHmacKey(secret); const signature = await crypto.subtle.sign( "HMAC", key, encoder.encode(signingInput), ); return bytesToBase64Url(new Uint8Array(signature)); } async function verifyToken( token: string, secret: string, ): Promise<JwtClaims | null> { try { const parts = token.split("."); if (parts.length !== 3) return null; const [headerPart, payloadPart, signaturePart] = parts; if (!headerPart || !payloadPart || !signaturePart) return null; const headerRaw = safeParseJson( decoder.decode(base64UrlToBytes(headerPart)), ); if (!isRecord(headerRaw)) return null; if (headerRaw.alg !== "HS256") return null; if (headerRaw.typ !== "JWT") return null; const expectedSignature = await signInput( `${headerPart}.${payloadPart}`, secret, ); const 
expectedBytes = base64UrlToBytes(expectedSignature); const providedBytes = base64UrlToBytes(signaturePart); if (!fixedTimeEqual(expectedBytes, providedBytes)) return null; const payloadRaw = safeParseJson( decoder.decode(base64UrlToBytes(payloadPart)), ); return parseClaims(payloadRaw); } catch { return null; } } function parseRevalidateRequest(rawRequest: string): RevalidateRequest | null { const parsed = safeParseJson(rawRequest); if (!isRecord(parsed)) return null; if (typeof parsed.token !== "string" || parsed.token.length === 0) { return null; } return { token: parsed.token, nowSec: parsed.nowSec as number | undefined, ttlSec: parsed.ttlSec as number | undefined, renewWindowSec: parsed.renewWindowSec as number | undefined, }; } function shouldRenewToken( claims: JwtClaims, nowSec: number, renewWindowSec: number, ): boolean { if (nowSec > claims.renewUntil) return false; return nowSec >= claims.exp - renewWindowSec; } function responseError(reason: string): RevalidateResponse { return { ok: false, renewed: false, reason }; } export async function issueTokenHost( claims: JwtClaims, secret = DEMO_JWT_SECRET, ): Promise<string> { const header: JwtHeader = { alg: "HS256", typ: "JWT" }; const headerPart = bytesToBase64Url(encoder.encode(JSON.stringify(header))); const payloadPart = bytesToBase64Url(encoder.encode(JSON.stringify(claims))); const signingInput = `${headerPart}.${payloadPart}`; const signaturePart = await signInput(signingInput, secret); return `${signingInput}.${signaturePart}`; } export async function revalidateTokenObjectHost( rawRequest: string, secret = DEMO_JWT_SECRET, ): Promise<RevalidateResponse> { const request = parseRevalidateRequest(rawRequest); if (!request) { return responseError("payload: expected JSON { token, nowSec? 
}"); } const nowSec = clampInt( request.nowSec, Math.floor(Date.now() / 1000), 1, 2_147_483_647, ); const ttlSec = clampInt( request.ttlSec, DEFAULT_TTL_SEC, MIN_TTL_SEC, MAX_TTL_SEC, ); const renewWindowSec = clampInt( request.renewWindowSec, DEFAULT_RENEW_WINDOW_SEC, 0, MAX_RENEW_WINDOW_SEC, ); const claims = await verifyToken(request.token, secret); if (!claims) { return responseError("token: invalid signature, claims, or format"); } const renewable = shouldRenewToken(claims, nowSec, renewWindowSec); if (renewable) { const renewedExp = Math.min(nowSec + ttlSec, claims.renewUntil); if (renewedExp > nowSec) { const renewedClaims: JwtClaims = { ...claims, iat: nowSec, exp: renewedExp, }; const renewedToken = await issueTokenHost(renewedClaims, secret); return { ok: true, renewed: true, token: renewedToken, sub: claims.sub, sid: claims.sid, exp: renewedExp, canRenew: nowSec < claims.renewUntil, }; } } if (nowSec > claims.exp) { return responseError("token: expired and outside renewal policy"); } return { ok: true, renewed: false, token: request.token, sub: claims.sub, sid: claims.sid, exp: claims.exp, canRenew: nowSec <= claims.renewUntil, }; } export async function revalidateTokenHost(rawRequest: string): Promise<string> { const response = await revalidateTokenObjectHost(rawRequest); return JSON.stringify(response); } export const revalidateToken = task({ f: revalidateTokenHost, }); function addSummary( totals: RenewalSummary, response: RevalidateResponse, outputBytes: number, ): RenewalSummary { const next: RenewalSummary = { ok: totals.ok, renewed: totals.renewed, rejected: totals.rejected, outputBytes: totals.outputBytes + outputBytes, }; if (!response.ok) { next.rejected += 1; return next; } next.ok += 1; if (response.renewed) next.renewed += 1; return next; } export async function revalidateTokenBatchFastHost( rawRequests: string[], ): Promise<RenewalSummary> { let totals: RenewalSummary = { ok: 0, renewed: 0, rejected: 0, outputBytes: 0, }; for (let i = 0; i < rawRequests.length; 
i++) { const response = await revalidateTokenObjectHost(rawRequests[i]!); const responseJson = JSON.stringify(response); totals = addSummary(totals, response, responseJson.length); } return totals; } export const revalidateTokenBatchFast = task({ f: revalidateTokenBatchFastHost, }); export function summarizeJsonResponses(rawResponses: string[]): RenewalSummary { const totals: RenewalSummary = { ok: 0, renewed: 0, rejected: 0, outputBytes: 0, }; for (let i = 0; i < rawResponses.length; i++) { const raw = rawResponses[i]!; totals.outputBytes += raw.length; const parsed = safeParseJson(raw); if (!isRecord(parsed) || parsed.ok !== true) { totals.rejected += 1; continue; } totals.ok += 1; if (parsed.renewed === true) totals.renewed += 1; } return totals; } function tamperToken(token: string): string { const chars = token.split(""); const last = chars.length - 1; chars[last] = chars[last] === "a" ? "b" : "a"; return chars.join(""); } export async function buildDemoRevalidateRequests( options: DemoRequestOptions, ): Promise { const count = clampInt(options.count, 1, 1, 5_000_000); const nowSec = clampInt( options.nowSec, Math.floor(Date.now() / 1000), 1, 2_147_483_647, ); const ttlSec = clampInt( options.ttlSec, DEFAULT_TTL_SEC, MIN_TTL_SEC, MAX_TTL_SEC, ); const renewWindowSec = clampInt( options.renewWindowSec, DEFAULT_RENEW_WINDOW_SEC, 0, MAX_RENEW_WINDOW_SEC, ); const invalidPercent = clampInt(options.invalidPercent, 10, 0, 95); const renewable = await issueTokenHost({ sub: "u_demo_renewable", sid: "s_renewable", scope: ["read", "profile"], iat: nowSec - 70, exp: nowSec + Math.max(5, renewWindowSec - 3), renewUntil: nowSec + 900, }); const fresh = await issueTokenHost({ sub: "u_demo_fresh", sid: "s_fresh", scope: ["read"], iat: nowSec - 10, exp: nowSec + 180, renewUntil: nowSec + 900, }); const expiredRenewable = await issueTokenHost({ sub: "u_demo_expired_grace", sid: "s_expired_grace", scope: ["read", "write"], iat: nowSec - 200, exp: nowSec - 4, renewUntil: nowSec 
+ 240,
  });
  const expiredHard = await issueTokenHost({
    sub: "u_demo_expired_hard",
    sid: "s_expired_hard",
    scope: ["read"],
    iat: nowSec - 500,
    exp: nowSec - 30,
    renewUntil: nowSec - 3,
  });

  const badSignature = tamperToken(fresh);
  const malformed = "not-a-jwt";

  const requests = new Array<string>(count);
  for (let i = 0; i < count; i++) {
    const withinInvalid = i % 100 < invalidPercent;
    let token: string;
    if (withinInvalid) {
      token = i % 2 === 0 ? badSignature : malformed;
    } else {
      switch (i % 4) {
        case 0:
          token = renewable;
          break;
        case 1:
          token = fresh;
          break;
        case 2:
          token = expiredRenewable;
          break;
        default:
          token = expiredHard;
      }
    }
    requests[i] = JSON.stringify({
      token,
      nowSec,
      ttlSec,
      renewWindowSec,
    });
  }
  return requests;
}
```

## Why offload JWT work

HMAC verification and key derivation are CPU-bound. On a busy API server handling hundreds of authenticated requests per second, that crypto work competes with route handling on the main thread. Moving it to a pool keeps your event loop responsive -- especially under mixed traffic where some routes are cheap and others hit the auth path hard.

---

# Monte Carlo pi

URL: https://knittingdocs.netlify.app/examples/maths/monte_pi/

Estimate pi by throwing darts at a circle -- embarrassingly parallel

The classic Monte Carlo dartboard: throw random points into a square, count how many land inside the unit circle, and estimate pi. Every point is independent, making this embarrassingly parallel and a clean demonstration of Knitting's map-reduce pattern.

## How it works

Throw lots of random points into $$[-1,1]\times[-1,1]$$. A point is "inside" if $$x^2 + y^2 \le 1$$. The unit circle has area $$\pi$$ and the square has area 4, so the inside fraction approaches $$\pi/4$$:

$$
\pi \approx 4 \cdot \frac{\text{inside}}{\text{total}}
$$

The host divides the total sample count into chunks, dispatches each chunk to a worker via `piChunk([seed, samples])` (a single tuple argument), and each worker runs a tight inner loop with a fast deterministic RNG (xorshift32). Each chunk returns a tiny summary: `{ inside, samples }`.
The host aggregates and prints the final estimate.

## Run

Expected output:

```
threads: 6
samples: 50,000,000
chunk: 1,000,000
dispatching 50 chunks...
pi ~ 3.14159842 (error: +0.00000577)
elapsed: 0.87s
```

> Note: Monte Carlo error shrinks like `1/sqrt(N)` -- to halve the error, you need 4x more samples. With 50M samples the standard error is roughly 0.0002, so you'll typically land well within ~0.001 of the true value.

## Code

```ts
import { createPool, isMain } from "knitting";
import { piChunk } from "./montecarlo_pi.ts";

function intArg(name: string, fallback: number) {
  const idx = process.argv.indexOf(`--${name}`);
  if (idx !== -1 && idx + 1 < process.argv.length) {
    const v = Number(process.argv[idx + 1]);
    if (Number.isFinite(v) && v > 0) return Math.floor(v);
  }
  return fallback;
}

// Tunables (pick any numbers you like)
const TOTAL_SAMPLES = intArg("samples", 50_000_000_000);
const CHUNK_SAMPLES = intArg("chunk", 10_000_000);
const THREADS = intArg("threads", 6);

const { call, shutdown } = createPool({
  threads: THREADS,
  inliner: {
    position: "last",
    batchSize: 8,
  },
  balancer: "firstIdle",
})({ piChunk });

async function main() {
  const jobCount = Math.ceil(TOTAL_SAMPLES / CHUNK_SAMPLES);
  const jobs = new Array<Promise<{ inside: number; samples: number }>>(
    jobCount,
  );

  // Seed base: stable-ish, different each run
  const seedBase = ((Date.now() | 0) ^ 0x9e3779b9) | 0;

  // Queue one worker task per chunk.
  for (let i = 0; i < jobCount; i++) {
    const remaining = TOTAL_SAMPLES - i * CHUNK_SAMPLES;
    const samples = remaining >= CHUNK_SAMPLES ?
CHUNK_SAMPLES : remaining;

    // Spread seeds so chunks don't reuse the same random stream
    const seed = (seedBase + (i * 0x6d2b79f5)) | 0;

    jobs[i] = call.piChunk([seed, samples]);
  }

  const time = performance.now();
  const results = await Promise.all(jobs);
  const finished = performance.now();

  let inside = 0;
  let total = 0;
  for (const r of results) {
    inside += r.inside;
    total += r.samples;
  }

  const pi = (4 * inside) / total;

  // Quick sanity: expected sampling error scales like ~1/sqrt(N)
  const approxStdErr = 1 / Math.sqrt(total);

  console.log("Monte Carlo π estimate");
  console.log("threads :", THREADS + 1);
  console.log("total samples:", total.toLocaleString());
  console.log("chunk size :", CHUNK_SAMPLES.toLocaleString());
  console.log("pi :", pi);
  console.log("took :", (finished - time).toFixed(3), " ms");
  console.log("rough ±err :", `~${(approxStdErr * 4).toExponential(2)}`);
}

if (isMain) {
  main().finally(shutdown);
}
```

```ts
import { task } from "knitting";

type ChunkArgs = readonly [seed: number, samples: number];
type ChunkResult = { inside: number; samples: number };

/**
 * Fast, deterministic RNG: xorshift32.
 * (Good enough for Monte Carlo demos, and much faster than Math.random in tight loops.)
 */
function xorshift32(state: number): number {
  state |= 0;
  state ^= state << 13;
  state ^= state >>> 17;
  state ^= state << 5;
  return state | 0;
}

const INV_2_POW_32 = 2.3283064365386963e-10; // 1 / 2^32

export const piChunk = task({
  f: ([seed, samples]: ChunkArgs): ChunkResult => {
    let s = seed | 0;
    let inside = 0;

    for (let i = 0; i < samples; i++) {
      // Map each 32-bit state to a coordinate in [-1, 1).
      s = xorshift32(s);
      const x = ((s >>> 0) * INV_2_POW_32) * 2 - 1;
      s = xorshift32(s);
      const y = ((s >>> 0) * INV_2_POW_32) * 2 - 1;

      const r2 = x * x + y * y;
      if (r2 <= 1) inside++;
    }

    return { inside, samples };
  },
});
```

## Why this is a good parallel pattern

This example captures the core scientific computing pattern: **map** (simulate many independent trials) then **reduce** (combine partial statistics).
Each chunk returns a small summary, not raw data -- that's critical for keeping transfer overhead low. The same structure works for numerical integration, uncertainty propagation, risk estimation, statistical physics, and any problem you can phrase as "run many trials and combine results."

## Practical notes

**Chunk size matters.** Each task should do enough work to justify dispatch overhead. Start with 100k-5M iterations per chunk depending on your loop cost. Too small and overhead dominates; too big and you lose load balancing.

**Seed carefully.** Use one base seed (so runs are reproducible) and derive a different per-chunk seed (so chunks don't share the same random stream). This example derives per-chunk seeds this way, but its base seed comes from `Date.now()`, so pin it to a constant when you need reproducible runs.

**Keep the inner loop tight.** Avoid allocations per iteration. This example uses xorshift32 instead of `Math.random()` for speed.

## Things to try

1. Increase `--samples` to 500M and watch the estimate stabilize.
2. Try different `--chunk` sizes and measure throughput -- there's a sweet spot.
3. Fix the seed and confirm identical output across runs.
4. Modify the worker to also return timing per chunk and plot a histogram.

---

# React SSR compression

URL: https://knittingdocs.netlify.app/examples/data_transforms/rendering_output/react_ssr_compress/

SSR + Brotli compression on workers -- testing where compression should live.

Takes the React SSR example one step further: after rendering HTML, compress it with Brotli. This tests whether it's better to compress on the worker (render + compress in one shot) or on the host (render on worker, compress on main thread). Spoiler: doing both on the worker usually wins because you avoid sending uncompressed HTML back across the thread boundary.

## How it works

The host generates JSON payload strings. Both the host and worker paths render the same user card component, then Brotli-compress the HTML output. The benchmark compares compressed byte totals for parity, then measures throughput.
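To make the placement argument concrete, the sketch below compares the bytes that would cross the thread boundary in each case: the raw HTML string versus its Brotli-compressed buffer. `renderCardHtml` is a hypothetical stand-in for the React render (it is not one of the example's files), and the sizes depend entirely on the markup, so treat any numbers it prints as illustrative rather than as benchmark results.

```typescript
import { brotliCompressSync, brotliDecompressSync } from "node:zlib";

// Hypothetical stand-in for the rendered user card; the real example
// renders a React component instead of concatenating strings.
function renderCardHtml(name: string, items: number): string {
  const rows: string[] = [];
  for (let i = 0; i < items; i++) {
    rows.push(`<li class="item">${name} item ${i}</li>`);
  }
  return `<section class="card"><h2>${name}</h2><ul>${rows.join("")}</ul></section>`;
}

const html = renderCardHtml("Ada", 200);

// Bytes that travel back if the worker returns the uncompressed HTML.
const rawBytes = Buffer.byteLength(html, "utf8");

// Bytes that travel back if the worker compresses before returning.
const compressed = brotliCompressSync(html);

// Sanity check: the compressed buffer decodes to the same HTML.
const roundTrip = brotliDecompressSync(compressed).toString("utf8");

console.log("html bytes       :", rawBytes);
console.log("brotli bytes     :", compressed.byteLength);
console.log("round trip equal :", roundTrip === html);
```

The design point is that the compressed buffer is the only thing the host needs for the response, so producing it on the worker keeps both the CPU cost and the larger intermediate string off the host.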
- `bench_react_ssr_compress.ts` -- the benchmark - `render_user_card_compressed.tsx` -- the SSR + compression task - `utils.ts` -- shared payload and compression helpers ## Example input and output Input is the same JSON payload shape as the plain React SSR example. The worker call is: ```ts const compressed = await pool.call.renderUserCardCompressed(payloadJson); ``` Output is a Brotli-compressed `Buffer`, not an HTML string. That is the point of this example: do the expensive render and compression work in one place, then return the smaller payload. Compression uses built-in `node:zlib` -- no extra packages. ## Optional benchmark Expected output: ``` byte parity check: host=42,380 worker=42,380 OK match benchmark avg (ns) min ... max (ns) host 45,600 41,200 ... 58,300 knitting 24,100 21,800 ... 32,400 ``` Brotli is significantly more expensive than `renderToString` alone, so the worker advantage is more pronounced here than in the plain SSR example. ## Code ```tsx import { task } from "knitting"; import { brotliCompressSync } from "node:zlib"; import { renderUserCardHost } from "../react_ssr/render_user_card.tsx"; function compressHtml(html: string) { return brotliCompressSync(html); } export const renderUserCardCompressed = task({ f: (payload: string) => { const html = renderUserCardHost(payload); const compressed = compressHtml(html); return compressed; }, }); ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { renderUserCardHost } from "../react_ssr/render_user_card.tsx"; import { renderUserCardCompressed } from "./render_user_card_compressed.tsx"; import { buildCompressionPayloads, compressHtml, sumCompressedBytes, } from "./utils.ts"; const THREADS = 1; const REQUESTS = 100; async function main() { const payloads = buildCompressionPayloads(REQUESTS); const pool = createPool({ threads: THREADS, inliner: { batchSize: 8, }, })({ renderUserCardCompressed }); let sink = 0; try { 
    // Run both paths once before the measured runs.
    runHost(payloads);
    await runWorkers(
      pool.call.renderUserCardCompressed,
      payloads,
    );

    console.log("React SSR + compression benchmark (mitata)");
    console.log("workload: parse + normalize + render + brotli");
    console.log("requests per iteration:", REQUESTS.toLocaleString());
    console.log("threads:", THREADS, " + main");

    boxplot(() => {
      summary(() => {
        bench(`host (${REQUESTS.toLocaleString()} req)`, () => {
          sink = runHost(payloads);
        });

        bench(
          `knitting (${THREADS} thread(s), ${REQUESTS.toLocaleString()} req)`,
          async () => {
            sink = await runWorkers(
              pool.call.renderUserCardCompressed,
              payloads,
            );
          },
        );
      });
    });

    await run();
    console.log("last compressed bytes:", sink.toLocaleString());
  } finally {
    pool.shutdown();
  }
}

function runHost(payloads: string[]): number {
  let compressedBytes = 0;
  for (let i = 0; i < payloads.length; i++) {
    const html = renderUserCardHost(payloads[i]!);
    compressedBytes += compressHtml(html).byteLength;
  }
  return compressedBytes;
}

async function runWorkers(
  callRender: (payload: string) => Promise<{ byteLength: number }>,
  payloads: string[],
): Promise<number> {
  const jobs: Promise<{ byteLength: number }>[] = [];
  for (let i = 0; i < payloads.length; i++) {
    jobs.push(callRender(payloads[i]!));
  }
  const results = await Promise.all(jobs);
  return sumCompressedBytes(results);
}

if (isMain) {
  main().catch((error) => {
    console.error(error);
    process.exitCode = 1;
  });
}
```

```ts
import { brotliCompressSync } from "node:zlib";
import { renderUserCardHost } from "../react_ssr/render_user_card.tsx";
import { buildUserPayloads } from "../react_ssr/utils.ts";

export type CompressionResult = {
  ms: number;
  bytes: number;
};

export function buildCompressionPayloads(count: number): string[] {
  return buildUserPayloads(count);
}

export function compressHtml(html: string) {
  return brotliCompressSync(html);
}

export function sumCompressedBytes(
  chunks: ArrayLike<{ byteLength: number }>,
): number {
  let total = 0;
  for (let i = 0; i < chunks.length; i++) {
    total +=
chunks[i]!.byteLength; } return total; } export function runHostCompression(payloads: string[]): CompressionResult { const started = performance.now(); let compressedBytes = 0; for (let i = 0; i < payloads.length; i++) { const html = renderUserCardHost(payloads[i]!); compressedBytes += compressHtml(html).byteLength; } return { ms: performance.now() - started, bytes: compressedBytes }; } export function printCompressionMetrics( mode: string, requests: number, ms: number, compressedBytes: number, ): void { const secs = Math.max(1e-9, ms / 1000); const rps = requests / secs; console.log(`${mode} took : ${ms.toFixed(2)} ms`); console.log(`${mode} throughput : ${rps.toFixed(0)} req/s`); console.log(`${mode} compressed : ${compressedBytes.toLocaleString()}`); } ``` ## Compression placement matters Where you compress affects both throughput and data transfer. If you compress on the worker, you send a small compressed buffer back to the host. If you compress on the host, you first transfer the full uncompressed HTML string across the thread boundary, then compress it. For response compression in a real server, doing render + compress on the worker is almost always the right call. --- # Markdown to HTML URL: https://knittingdocs.netlify.app/examples/data_transforms/rendering_output/markdown_to_html/ Convert markdown to HTML with Brotli compression -- a simple transform pipeline. Converts markdown documents to HTML using `marked`, then compresses the output with Brotli. A straightforward transform pipeline -- parse, render, compress -- that shows how well workers handle chained operations. ## How it works The host generates markdown documents. Both the host and worker paths parse the markdown, render HTML, then Brotli-compress the result. Compressed byte totals are compared for parity before the benchmark runs. 
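The parity check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's own code: `renderStandIn` is a hypothetical replacement for `marked` so the block stays self-contained, and the "worker" path is simulated locally, whereas the real benchmark gathers its chunks from awaited `pool.call.*` promises.

```typescript
import { brotliCompressSync } from "node:zlib";

// Hypothetical stand-in for the markdown renderer (the real example uses `marked`).
function renderStandIn(markdown: string): string {
  return markdown
    .split("\n")
    .map((line) =>
      line.startsWith("# ") ? `<h1>${line.slice(2)}</h1>` : `<p>${line}</p>`
    )
    .join("\n");
}

// One pipeline function shared by both paths keeps the comparison fair.
function renderAndCompress(markdown: string): Uint8Array {
  return brotliCompressSync(renderStandIn(markdown));
}

function sumBytes(chunks: ArrayLike<{ byteLength: number }>): number {
  let total = 0;
  for (let i = 0; i < chunks.length; i++) total += chunks[i]!.byteLength;
  return total;
}

const docs = Array.from(
  { length: 50 },
  (_, i) => `# Doc ${i}\nBody text for document ${i}.`,
);

// "Host" path: a plain loop over the documents.
const hostChunks: Uint8Array[] = [];
for (const doc of docs) hostChunks.push(renderAndCompress(doc));

// Simulated "worker" path: same pipeline, separate pass over the same input.
const workerChunks = docs.map((doc) => renderAndCompress(doc));

const hostTotal = sumBytes(hostChunks);
const workerTotal = sumBytes(workerChunks);
const parity = hostTotal === workerTotal;

console.log(
  `byte parity check: host=${hostTotal} worker=${workerTotal} ${
    parity ? "OK match" : "MISMATCH"
  }`,
);
```

Comparing summed byte lengths is a cheap check that both paths ran the same transform on the same input; it will not catch two different outputs that happen to have equal sizes, so a stricter check would compare the buffers themselves.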
Three files: - `run_markdown_to_html.ts` -- a small runnable example that compares host and worker output - `bench_markdown_to_html.ts` -- the optional host-vs-worker benchmark - `utils.ts` -- markdown rendering, compression tasks, and shared helpers ## Run Example markdown in: ```md # Knitting markdown example This example renders markdown on the host and in a worker. ## Checklist - Parse markdown - Render HTML - Compare outputs ``` HTML out: ```html

<h1>Knitting markdown example</h1>
<p>This example renders markdown on the host and in a worker.</p>
<h2>Checklist</h2>

... ``` ## Optional benchmark Expected output: ``` byte parity check: host=18,720 worker=18,720 OK match benchmark avg (ns) min ... max (ns) host 32,100 29,400 ... 41,200 knitting 17,800 15,600 ... 24,300 ``` ## Code ```ts import { createPool, isMain } from "knitting"; import { markdownToHtml, markdownToHtmlHost } from "./utils.ts"; function intArg(name: string, fallback: number): number { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const value = Number(process.argv[i + 1]); if (Number.isFinite(value) && value > 0) return Math.floor(value); } return fallback; } const THREADS = intArg("threads", 2); const SAMPLE_MARKDOWN = [ "# Knitting markdown example", "", "This example renders markdown on the host and in a worker.", "", "## Checklist", "", "- Parse markdown", "- Render HTML", "- Compare outputs", "", "```ts", "const status = 'ready';", "```", ].join("\n"); async function main() { const hostHtml = markdownToHtmlHost(SAMPLE_MARKDOWN); const pool = createPool({ threads: THREADS })({ markdownToHtml }); try { const workerHtml = await pool.call.markdownToHtml(SAMPLE_MARKDOWN); console.log("Markdown -> HTML example"); console.log("threads :", THREADS); console.log("same html :", hostHtml === workerHtml); console.log("html length :", workerHtml.length); console.log("html preview :", workerHtml.slice(0, 120), "..."); } finally { pool.shutdown(); } } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { buildMarkdownDocs, markdownToHtmlCompressed, markdownToHtmlCompressedHost, sumChunkBytes, } from "./utils.ts"; const THREADS = 2; const DOCS = 2_000; function runHost(markdowns: string[]): number { let compressedBytes = 0; for (let i = 0; i < markdowns.length; i++) { compressedBytes += markdownToHtmlCompressedHost(markdowns[i]!).byteLength; } return compressedBytes; } 
async function runWorkers(
  callRender: (markdown: string) => Promise<Uint8Array>,
  markdowns: string[],
): Promise<number> {
  const jobs: Promise<Uint8Array>[] = [];
  for (let i = 0; i < markdowns.length; i++) {
    jobs.push(callRender(markdowns[i]!));
  }
  const chunks = await Promise.all(jobs);
  return sumChunkBytes(chunks);
}

async function main() {
  const markdowns = buildMarkdownDocs(DOCS);
  const pool = createPool({ threads: THREADS })({ markdownToHtmlCompressed });
  let sink = 0;

  try {
    // Run the worker path once before the measured runs.
    await runWorkers(
      pool.call.markdownToHtmlCompressed,
      markdowns,
    );

    console.log("Markdown -> HTML benchmark (mitata)");
    console.log("workload: parse + render + brotli");
    console.log("docs per iteration:", DOCS.toLocaleString());
    console.log("threads:", THREADS);

    boxplot(() => {
      summary(() => {
        bench(`host (${DOCS.toLocaleString()} docs)`, () => {
          sink = runHost(markdowns);
        });

        bench(
          `knitting (${THREADS} thread(s), ${DOCS.toLocaleString()} docs)`,
          async () => {
            sink = await runWorkers(
              pool.call.markdownToHtmlCompressed,
              markdowns,
            );
          },
        );
      });
    });

    await run();
    console.log("last compressed bytes:", sink.toLocaleString());
  } finally {
    pool.shutdown();
  }
}

if (isMain) {
  main().catch((error) => {
    console.error(error);
    process.exitCode = 1;
  });
}
```

```ts
import { task } from "knitting";
import { marked } from "marked";
import { brotliCompressSync } from "node:zlib";

marked.setOptions({
  gfm: true,
});

export function markdownToHtmlHost(markdown: string): string {
  return marked.parse(markdown) as string;
}

export const markdownToHtml = task({
  f: markdownToHtmlHost,
});

export function markdownToHtmlCompressedHost(markdown: string) {
  const html = markdownToHtmlHost(markdown);
  return brotliCompressSync(html);
}

export const markdownToHtmlCompressed = task({
  f: markdownToHtmlCompressedHost,
});

export const TOPICS = [
  "workers",
  "schema",
  "compression",
  "batching",
  "latency",
  "rendering",
  "validation",
  "throughput",
];

// Cycle through the array so any index maps to a valid entry.
function pick<T>(arr: T[], i: number): T {
  return arr[i % arr.length]!;
}

export function makeMarkdown(i:
number): string { const topicA = pick(TOPICS, i); const topicB = pick(TOPICS, i + 3); const topicC = pick(TOPICS, i + 5); const id = i.toString(36); return [ `# Job ${id.toUpperCase()}`, "", `This page documents a ${topicA} pipeline for ${topicB}.`, "", "## Checklist", "", `- Parse input payload ${id}`, "- Validate required fields and defaults", `- Render output for ${topicC}`, "", "## Sample code", "", "```ts", `const jobId = \"${id}\";`, 'const status = "ready";', "```", "", `Generated at 2026-01-${String((i % 27) + 1).padStart(2, "0")}.`, ].join("\n"); } export function buildMarkdownDocs(count: number): string[] { const docs = new Array(count); for (let i = 0; i < count; i++) docs[i] = makeMarkdown(i); return docs; } export function sumChunkBytes( chunks: ArrayLike<{ byteLength: number }>, ): number { let total = 0; for (let i = 0; i < chunks.length; i++) { total += chunks[i]!.byteLength; } return total; } ``` ## A clean pipeline example This is the simplest rendering example -- no component tree, no JSX, just string in, string out. It's a good reference if you want to understand the worker pattern without the React SSR complexity. The same approach works for any transform chain: parse input, process it, compress or encode the result. --- # Physics loop URL: https://knittingdocs.netlify.app/examples/maths/physics_loop/ 2D random walk simulation -- branch-heavy physics loop parallelized across workers A physics-style simulation loop: step-by-step state updates with branching and early exits. Unlike the pi example (pure arithmetic), this one simulates a **2D random walk** -- start at the origin, move one unit in a random direction each step, stop when the particle crosses a radius boundary. The work per trial varies (some particles escape early, some don't), which makes chunking important for load balance. ## How it works 1. The host creates a pool and splits the total run count into chunks. 2. 
Each worker runs many independent particle trials in a tight inner loop: - Initialize `(x, y)` at the origin - Each step: pick a random direction, update position, check escape condition - Track whether the particle escaped and how many steps it took 3. Each chunk returns **compact summary stats**: escaped count, total runs, sum of escape steps. 4. The host aggregates chunk results into final estimates. We measure two things: - **Escape probability:** what fraction of particles reached the boundary within `maxSteps`? - **Mean escape time:** how many steps did it take (for particles that escaped)? ## Run Expected output: ``` threads: 6 runs: 15,000,000 batch: 5,000 steps: 15,000 radius: 100 escape probability: 0.9847 mean escape steps: 7,312 elapsed: 3.41s ``` > Note: With `--radius 100` and `--steps 15000`, most particles escape. Increase the radius or decrease max steps to see the escape probability drop. ## Code ```ts import { createPool, isMain } from "knitting"; import { walkChunk } from "./walk2d.ts"; function intArg(name: string, fallback: number) { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const v = Number(process.argv[i + 1]); if (Number.isFinite(v) && v > 0) return Math.floor(v); } return fallback; } function numArg(name: string, fallback: number) { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const v = Number(process.argv[i + 1]); if (Number.isFinite(v) && v > 0) return v; } return fallback; } // Tunables (pick any data you like) const THREADS = intArg("threads", 4); const TOTAL_RUNS = intArg("runs", 5_000_000); const RUNS_PER_JOB = intArg("batch", 5_000); const MAX_STEPS = intArg("steps", 15_000); const RADIUS = numArg("radius", 100); const { call, shutdown } = createPool({ threads: THREADS, balancer: "firstIdle", // Optional: inliner helps if each job is too small. 
// inliner: { position: "last", batchSize: 1 }, })({ walkChunk }); async function main() { const jobsCount = Math.ceil(TOTAL_RUNS / RUNS_PER_JOB); const jobs = new Array< Promise< { escaped: number; totalRuns: number; sumSteps: number; sumSteps2: number; } > >(jobsCount); const seedBase = ((Date.now() | 0) ^ 0x9e3779b9) | 0; for (let j = 0; j < jobsCount; j++) { const remaining = TOTAL_RUNS - j * RUNS_PER_JOB; const runs = remaining >= RUNS_PER_JOB ? RUNS_PER_JOB : remaining; // Spread seeds per job so streams differ const seed = (seedBase + (j * 0x6d2b79f5)) | 0; jobs[j] = call.walkChunk([seed, runs, MAX_STEPS, RADIUS]); } const results = await Promise.all(jobs); let escaped = 0; let total = 0; let sumSteps = 0; let sumSteps2 = 0; for (const r of results) { escaped += r.escaped; total += r.totalRuns; sumSteps += r.sumSteps; sumSteps2 += r.sumSteps2; } const pEscape = escaped / total; let mean = NaN; let stdev = NaN; if (escaped > 0) { mean = sumSteps / escaped; const mean2 = sumSteps2 / escaped; const variance = Math.max(0, mean2 - mean * mean); stdev = Math.sqrt(variance); } console.log("Monte Carlo: 2D random-walk first-exit"); console.log("threads :", THREADS); console.log("total runs :", total.toLocaleString()); console.log("radius :", RADIUS); console.log("max steps :", MAX_STEPS.toLocaleString()); console.log("escape prob :", pEscape); console.log("mean steps :", mean); console.log("stdev steps :", stdev); } if (isMain) { main().finally(shutdown); } ``` ```ts import { task } from "knitting"; type Args = readonly [ seed: number, runs: number, maxSteps: number, radius: number, dirPow2?: number, // optional: directions table size = 2^dirPow2 (default 10 => 1024) ]; type Result = { escaped: number; totalRuns: number; sumSteps: number; // sum of steps taken until escape (only for escaped runs) sumSteps2: number; // sum of steps^2 (only for escaped runs) }; // Fast deterministic RNG (xorshift32) function xorshift32(state: number): number { state |= 0; state ^= 
state << 13; state ^= state >>> 17; state ^= state << 5; return state | 0; } // Precompute direction tables (module-scope = done once per worker) function makeDirs(pow2: number) { const n = 1 << pow2; const xs = new Float64Array(n); const ys = new Float64Array(n); const twoPi = Math.PI * 2; for (let i = 0; i < n; i++) { const a = (i * twoPi) / n; xs[i] = Math.cos(a); ys[i] = Math.sin(a); } return { xs, ys, mask: n - 1 }; } // Default table: 1024 directions const DEFAULT_DIR_POW2 = 10; let DIRS = makeDirs(DEFAULT_DIR_POW2); export const walkChunk = task({ f: ([seed, runs, maxSteps, radius, dirPow2]) => { if (dirPow2 && dirPow2 !== DEFAULT_DIR_POW2) { // Rare path: allow custom resolution if you want DIRS = makeDirs(dirPow2 | 0); } const r2Limit = radius * radius; let s = seed | 0; let escaped = 0; let sumSteps = 0; let sumSteps2 = 0; const xs = DIRS.xs; const ys = DIRS.ys; const mask = DIRS.mask; for (let run = 0; run < runs; run++) { let x = 0.0; let y = 0.0; for (let step = 1; step <= maxSteps; step++) { s = xorshift32(s); const idx = s & mask; x += xs[idx]; y += ys[idx]; const r2 = x * x + y * y; if (r2 >= r2Limit) { escaped++; sumSteps += step; sumSteps2 += step * step; break; } } } return { escaped, totalRuns: runs, sumSteps, sumSteps2 }; }, }); ``` ## What makes this different from the pi example The pi example does the same amount of work per sample (two multiplies and a compare). This simulation has **variable work per trial** -- particles that escape early are cheap, particles that hit the step limit are expensive. That variability means chunking matters more: well-sized chunks smooth out the variance so no single worker gets stuck with all the hard trials. The inner loop is also branch-heavy (escape checks, direction selection), which is closer to real simulation code than a pure arithmetic kernel. ## The science This is a discrete-time approximation of **Brownian motion** / **diffusion**. 
The estimates converge by the Law of Large Numbers, and Monte Carlo error shrinks like `1/sqrt(N)`.

$$
\hat{p} = \frac{\text{escaped}}{\text{total runs}}
\qquad
\hat{\mu} = \frac{\sum \text{steps}}{\text{escaped}}
$$

Real applications of this pattern: diffusion/Brownian motion, hitting time problems, Monte Carlo transport (particles through materials), agent-based models, game simulation, and uncertainty propagation.

## Practical notes

**Chunk size:** Start with `--batch 5000` to `50000` for heavier loops. Increase if each run is short, decrease if each run is long.

**Keep the inner loop tight:** Avoid allocations per step. Precompute direction tables if possible. Use simple numeric types.

**Validate invariants:** Check that `0 <= escaped <= totalRuns`, totals add up across chunks, and results are stable under fixed seeds.

## Things to try

1. Increase `--radius` and see how the mean escape steps change.
2. Compare different `--batch` chunk sizes and measure throughput.
3. Use the reported standard deviation to compute a 95% confidence interval for the mean escape time.
4. Add a drift term (constant force) to the random walk and compare escape behavior.

---

# Salt hashing

URL: https://knittingdocs.netlify.app/examples/data_transforms/validation/salt_hashing/

PBKDF2 password hashing with constant-time verification -- the heaviest per-call workload in the validation set.

Derives password hashes with PBKDF2-SHA256 and verifies them with constant-time comparison. This is the heaviest per-call workload in the validation set -- each hash derivation is genuinely CPU-expensive by design (that's the point of PBKDF2).

## How it works

1. Generate a random salt per password.
2. Derive a hash using `PBKDF2-SHA256` via Web Crypto.
3. Store as a compact record: `algorithm$iterations$keyBytes$salt$hash`.
4. Verify login attempts by recomputing and constant-time comparing.

The benchmark uses `Uint8Array` payloads to reduce serialization noise in hot loops.
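Steps 1-4 above can be sketched in a few lines. This is a minimal illustration, not the example's `utils.ts`: to stay short and synchronous it swaps in Node's `node:crypto` helpers (`pbkdf2Sync`, `randomBytes`, `timingSafeEqual`) for the Web Crypto calls the example uses, and it assumes base64 for the salt and hash fields of the record. The names `hashRecord` and `verifyRecord` are hypothetical.

```typescript
import { pbkdf2Sync, randomBytes, timingSafeEqual } from "node:crypto";

const ALGORITHM = "pbkdf2-sha256";

// Steps 1-3: random salt, PBKDF2-SHA256 derivation, compact record.
// Assumption: salt and hash are base64-encoded inside the record.
function hashRecord(
  password: string,
  iterations = 10_000,
  keyBytes = 32,
): string {
  const salt = randomBytes(16);
  const hash = pbkdf2Sync(password, salt, iterations, keyBytes, "sha256");
  return [
    ALGORITHM,
    iterations,
    keyBytes,
    salt.toString("base64"),
    hash.toString("base64"),
  ].join("$");
}

// Step 4: recompute with the stored parameters, then compare in constant time.
function verifyRecord(password: string, record: string): boolean {
  const parts = record.split("$");
  if (parts.length !== 5 || parts[0] !== ALGORITHM) return false;

  const iterations = Number(parts[1]);
  const keyBytes = Number(parts[2]);
  if (!Number.isInteger(iterations) || iterations <= 0) return false;
  if (!Number.isInteger(keyBytes) || keyBytes <= 0) return false;

  const salt = Buffer.from(parts[3]!, "base64");
  const expected = Buffer.from(parts[4]!, "base64");
  // timingSafeEqual throws on length mismatch, so reject early.
  if (expected.length !== keyBytes) return false;

  const actual = pbkdf2Sync(password, salt, iterations, keyBytes, "sha256");
  return timingSafeEqual(actual, expected);
}

const record = hashRecord("correct horse battery staple");
console.log("fields :", record.split("$").length);
console.log("match  :", verifyRecord("correct horse battery staple", record));
console.log("wrong  :", verifyRecord("not the password", record));
```

The iteration count here is deliberately low so the sketch runs quickly; a production value would be much higher, which is exactly what makes each call heavy enough to be worth sending to a worker.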
Three files:

- `salt_knitting.ts` -- runs salting + verification in host and Knitting modes
- `utils.ts` -- hashing, verification, and fast-path packet functions
- `bench_salt_hashing.ts` -- host-vs-worker benchmark with `mitata`

## Example record shape

The stored credential format is intentionally compact:

```txt
pbkdf2-sha256$600000$32$<salt>$<hash>
```

At a high level the flow is:

```txt
password -> derive PBKDF2 hash with random salt -> store compact record
login attempt -> derive again with stored salt -> constant-time compare
```

That makes the worker boundary simple: small input, small output, expensive CPU work in the middle.

Uses built-in Web Crypto APIs -- no extra crypto package required.

## Run

Expected output:

```
-- host mode --
hashed: 100
verified: 100
mismatches: 0

-- knitting mode (2 threads) --
hashed: 100
verified: 100
mismatches: 0
```

## Optional benchmark

Compares hashing typed-array packets through direct imports (`host`) vs worker task calls (`knitting`).

Because PBKDF2 is intentionally slow (high iteration count), this is where workers shine most -- each call does enough real work to easily justify dispatch overhead.

## Code

```ts
import { createPool, isMain } from "knitting";
import {
  decodeHashResultPacket,
  hashPassword,
  hashPasswordHost,
  hashPasswordPacketHost,
  makeHashPacketForIndex,
  verifyPassword,
  verifyPasswordHost,
} from "./utils.ts";

const THREADS = 2;
const REQUESTS = 2_000;
const ITERATIONS = 120_000;
const MISMATCH_PERCENT = 5;

type Summary = {
  hashed: number;
  verified: number;
  mismatched: number;
};

function passwordFor(i: number): string {
  return `user-${i.toString(36)}-password`;
}

// Returns a deliberately wrong password for MISMATCH_PERCENT of the indexes.
function expectedPassword(i: number): string {
  if (i % 100 < MISMATCH_PERCENT) return `wrong-${i.toString(36)}-password`;
  return passwordFor(i);
}

async function runHost(): Promise
<Summary>
{ let verified = 0; let mismatched = 0; for (let i = 0; i < REQUESTS; i++) { const password = passwordFor(i); const hashed = await hashPasswordHost({ password, iterations: ITERATIONS }); const checked = await verifyPasswordHost({ password: expectedPassword(i), record: hashed.record, }); if (checked.ok) verified++; else mismatched++; } return { hashed: REQUESTS, verified, mismatched }; } async function runWorkers(): Promise
<Summary>
{ const pool = createPool({ threads: THREADS })({ hashPassword, verifyPassword, }); try { const hashJobs: Promise<{ record: string }>[] = []; for (let i = 0; i < REQUESTS; i++) { hashJobs.push(pool.call.hashPassword({ password: passwordFor(i), iterations: ITERATIONS, })); } const hashes = await Promise.all(hashJobs); const verifyJobs: Promise<{ ok: boolean }>[] = []; for (let i = 0; i < REQUESTS; i++) { verifyJobs.push(pool.call.verifyPassword({ password: expectedPassword(i), record: hashes[i]!.record, })); } const checks = await Promise.all(verifyJobs); let verified = 0; for (let i = 0; i < checks.length; i++) { if (checks[i]!.ok) verified++; } const mismatched = REQUESTS - verified; return { hashed: REQUESTS, verified, mismatched }; } finally { pool.shutdown(); } } function printSummary(mode: string, summary: Summary, ms: number): void { const seconds = Math.max(1e-9, ms / 1000); const ops = REQUESTS / seconds; console.log(mode); console.log("requests :", REQUESTS.toLocaleString()); console.log("iterations :", ITERATIONS.toLocaleString()); console.log("mismatch rate :", `${MISMATCH_PERCENT}%`); console.log("hashed :", summary.hashed.toLocaleString()); console.log("verified :", summary.verified.toLocaleString()); console.log("mismatched :", summary.mismatched.toLocaleString()); console.log("took :", `${ms.toFixed(2)} ms`); console.log("throughput :", `${ops.toFixed(0)} req/s`); } async function printPacketSample() { const packet = makeHashPacketForIndex(7, ITERATIONS, 32, 16); const result = await hashPasswordPacketHost(packet); const decoded = decodeHashResultPacket(result); console.log("packet sample : iterations", decoded.iterations); console.log("salt(base64) :", decoded.saltBase64); console.log("hash(base64) :", decoded.hashBase64); } async function main() { const hostStart = performance.now(); const host = await runHost(); const hostMs = performance.now() - hostStart; const workerStart = performance.now(); const knitting = await runWorkers(); const workerMs 
= performance.now() - workerStart; const uplift = (hostMs / Math.max(1e-9, workerMs) - 1) * 100; console.log("Password salting + hashing"); console.log(`threads: ${THREADS}`); console.log(""); printSummary("host", host, hostMs); console.log(""); printSummary("knitting", knitting, workerMs); console.log(""); console.log(`uplift: ${uplift.toFixed(1)}%`); await printPacketSample(); } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { buildDemoHashPackets, type HashBatchSummary, hashPasswordPacketBatchFast, hashPasswordPacketBatchFastHost, } from "./utils.ts"; const THREADS = 2; const REQUESTS = 500; const BATCH = 32; const ITERATIONS = 1_200; const KEY_BYTES = 32; const SALT_BYTES = 16; async function main() { const packets = buildDemoHashPackets({ count: REQUESTS, iterations: ITERATIONS, keyBytes: KEY_BYTES, saltBytes: SALT_BYTES, }); const batches = makeBatches(packets, BATCH); const pool = createPool({ threads: THREADS })({ hashPasswordPacketBatchFast, }); let sink = 0; try { const hostCheck = await runHostBatches(batches); const workerCheck = await runWorkerBatches( pool.call.hashPasswordPacketBatchFast, batches, ); if (!same(hostCheck, workerCheck)) { throw new Error("Host and worker hashing summaries differ."); } console.log("Salt hashing benchmark (mitata)"); console.log("workload: PBKDF2-SHA256 on Uint8Array request packets"); console.log("requests per iteration:", REQUESTS.toLocaleString()); console.log("iterations:", ITERATIONS.toLocaleString()); console.log("batch size:", BATCH); console.log("threads:", THREADS); boxplot(() => { summary(() => { bench( `host (${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runHostBatches(batches); sink = totals.outputBytes ^ totals.digestXor; }, ); bench( `knitting (${THREADS} thread${ THREADS === 1 ? 
"" : "s" }, ${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runWorkerBatches( pool.call.hashPasswordPacketBatchFast, batches, ); sink = totals.outputBytes ^ totals.digestXor; }, ); }); }); await run(); console.log("last sink:", sink.toLocaleString()); } finally { pool.shutdown(); } } function makeBatches(packets: Uint8Array[], batchSize: number): Uint8Array[][] { const out: Uint8Array[][] = []; for (let i = 0; i < packets.length; i += batchSize) { out.push(packets.slice(i, i + batchSize)); } return out; } function merge(a: HashBatchSummary, b: HashBatchSummary): HashBatchSummary { return { count: a.count + b.count, outputBytes: a.outputBytes + b.outputBytes, digestXor: a.digestXor ^ b.digestXor, }; } async function runHostBatches( batches: Uint8Array[][], ): Promise { let totals: HashBatchSummary = { count: 0, outputBytes: 0, digestXor: 0 }; for (let i = 0; i < batches.length; i++) { totals = merge(totals, await hashPasswordPacketBatchFastHost(batches[i]!)); } return totals; } async function runWorkerBatches( callBatch: (packets: Uint8Array[]) => Promise, batches: Uint8Array[][], ): Promise { const jobs: Promise[] = []; for (let i = 0; i < batches.length; i++) jobs.push(callBatch(batches[i]!)); const results = await Promise.all(jobs); let totals: HashBatchSummary = { count: 0, outputBytes: 0, digestXor: 0 }; for (let i = 0; i < results.length; i++) totals = merge(totals, results[i]!); return totals; } function same(a: HashBatchSummary, b: HashBatchSummary): boolean { return a.count === b.count && a.outputBytes === b.outputBytes && a.digestXor === b.digestXor; } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { task } from "knitting"; const encoder = new TextEncoder(); const decoder = new TextDecoder(); const DEFAULT_ITERATIONS = 120_000; const DEFAULT_KEY_BYTES = 32; const DEFAULT_SALT_BYTES = 16; const MIN_ITERATIONS = 10_000; const MAX_ITERATIONS = 2_000_000; const 
MIN_KEY_BYTES = 16; const MAX_KEY_BYTES = 64; const MIN_SALT_BYTES = 8; const MAX_SALT_BYTES = 32; export type HashRequest = { password: string; iterations?: number; keyBytes?: number; saltBase64?: string; }; export type HashResponse = { record: string; algorithm: "pbkdf2-sha256"; iterations: number; keyBytes: number; saltBase64: string; hashBase64: string; }; export type VerifyRequest = { password: string; record: string; }; export type VerifyResponse = { ok: boolean; reason?: string; }; export type HashBatchSummary = { count: number; outputBytes: number; digestXor: number; }; export type DemoPacketOptions = { count: number; iterations?: number; keyBytes?: number; saltBytes?: number; }; type ParsedRecord = { algorithm: "pbkdf2-sha256"; iterations: number; keyBytes: number; salt: Uint8Array; hash: Uint8Array; }; function clampInt( value: unknown, fallback: number, min: number, max: number, ): number { const numeric = Number(value); if (!Number.isFinite(numeric)) return fallback; const integer = Math.floor(numeric); if (integer < min) return min; if (integer > max) return max; return integer; } function bytesToBase64(bytes: Uint8Array): string { let raw = ""; for (let i = 0; i < bytes.length; i++) raw += String.fromCharCode(bytes[i]!); return btoa(raw); } function base64ToBytes(value: string): Uint8Array | null { try { const raw = atob(value); const bytes = new Uint8Array(raw.length); for (let i = 0; i < raw.length; i++) bytes[i] = raw.charCodeAt(i); return bytes; } catch { return null; } } function fixedTimeEqual(a: Uint8Array, b: Uint8Array): boolean { if (a.length !== b.length) return false; let diff = 0; for (let i = 0; i < a.length; i++) diff |= a[i]! 
    ^ b[i]!;
  return diff === 0;
}

function assertPassword(password: string): string {
  if (typeof password !== "string" || password.length < 8) {
    throw new Error("password must be at least 8 characters");
  }
  return password;
}

function normalizeSalt(
  saltBase64: string | undefined,
  saltBytes: number,
): Uint8Array {
  if (!saltBase64) return crypto.getRandomValues(new Uint8Array(saltBytes));
  const salt = base64ToBytes(saltBase64);
  if (!salt) throw new Error("saltBase64 is not valid base64");
  if (salt.length < MIN_SALT_BYTES || salt.length > MAX_SALT_BYTES) {
    throw new Error(
      `salt length must be ${MIN_SALT_BYTES}-${MAX_SALT_BYTES} bytes`,
    );
  }
  return salt;
}

async function derivePbkdf2(
  passwordBytes: Uint8Array,
  saltBytes: Uint8Array,
  iterations: number,
  keyBytes: number,
): Promise<Uint8Array> {
  const key = await crypto.subtle.importKey(
    "raw",
    passwordBytes,
    "PBKDF2",
    false,
    ["deriveBits"],
  );
  const bits = await crypto.subtle.deriveBits(
    { name: "PBKDF2", hash: "SHA-256", salt: saltBytes, iterations },
    key,
    keyBytes * 8,
  );
  return new Uint8Array(bits);
}

// Record format: algorithm$iterations$keyBytes$saltBase64$hashBase64
function makeRecord(
  iterations: number,
  keyBytes: number,
  salt: Uint8Array,
  hash: Uint8Array,
): string {
  return [
    "pbkdf2-sha256",
    String(iterations),
    String(keyBytes),
    bytesToBase64(salt),
    bytesToBase64(hash),
  ].join("$");
}

function parseRecord(record: string): ParsedRecord | null {
  const parts = record.split("$");
  if (parts.length !== 5) return null;
  if (parts[0] !== "pbkdf2-sha256") return null;

  const iterations = Number(parts[1]);
  const keyBytes = Number(parts[2]);
  if (!Number.isInteger(iterations) || !Number.isInteger(keyBytes)) return null;
  if (iterations < MIN_ITERATIONS || iterations > MAX_ITERATIONS) return null;
  if (keyBytes < MIN_KEY_BYTES || keyBytes > MAX_KEY_BYTES) return null;

  const salt = base64ToBytes(parts[3]!);
  const hash = base64ToBytes(parts[4]!);
  if (!salt || !hash) return null;
  if (salt.length < MIN_SALT_BYTES || salt.length > MAX_SALT_BYTES) return null;
  if (hash.length !== keyBytes) return null;

  return {
    algorithm: "pbkdf2-sha256",
    iterations,
    keyBytes,
    salt,
    hash,
  };
}

export async function hashPasswordHost(
  request: HashRequest,
): Promise<HashResponse> {
  const password = assertPassword(request.password);
  const iterations = clampInt(
    request.iterations,
    DEFAULT_ITERATIONS,
    MIN_ITERATIONS,
    MAX_ITERATIONS,
  );
  const keyBytes = clampInt(
    request.keyBytes,
    DEFAULT_KEY_BYTES,
    MIN_KEY_BYTES,
    MAX_KEY_BYTES,
  );
  const saltBytes = clampInt(
    DEFAULT_SALT_BYTES,
    DEFAULT_SALT_BYTES,
    MIN_SALT_BYTES,
    MAX_SALT_BYTES,
  );
  const salt = normalizeSalt(request.saltBase64, saltBytes);
  const hash = await derivePbkdf2(
    encoder.encode(password),
    salt,
    iterations,
    keyBytes,
  );
  const saltBase64 = bytesToBase64(salt);
  const hashBase64 = bytesToBase64(hash);
  return {
    record: makeRecord(iterations, keyBytes, salt, hash),
    algorithm: "pbkdf2-sha256",
    iterations,
    keyBytes,
    saltBase64,
    hashBase64,
  };
}

export async function verifyPasswordHost(
  request: VerifyRequest,
): Promise<VerifyResponse> {
  const password = assertPassword(request.password);
  const parsed = parseRecord(request.record);
  if (!parsed) return { ok: false, reason: "record format is invalid" };

  // Re-derive with the stored parameters, then compare in constant time.
  const hash = await derivePbkdf2(
    encoder.encode(password),
    parsed.salt,
    parsed.iterations,
    parsed.keyBytes,
  );
  return fixedTimeEqual(hash, parsed.hash)
    ? { ok: true }
    : { ok: false, reason: "password mismatch" };
}

export const hashPassword = task({
  f: hashPasswordHost,
});

export const verifyPassword = task({
  f: verifyPasswordHost,
});

function writeU16LE(out: Uint8Array, offset: number, value: number): void {
  out[offset] = value & 255;
  out[offset + 1] = (value >>> 8) & 255;
}

function writeU32LE(out: Uint8Array, offset: number, value: number): void {
  out[offset] = value & 255;
  out[offset + 1] = (value >>> 8) & 255;
  out[offset + 2] = (value >>> 16) & 255;
  out[offset + 3] = (value >>> 24) & 255;
}

function readU16LE(input: Uint8Array, offset: number): number {
  return input[offset]! | (input[offset + 1]!
<< 8); } function readU32LE(input: Uint8Array, offset: number): number { return ( input[offset]! | (input[offset + 1]! << 8) | (input[offset + 2]! << 16) | (input[offset + 3]! << 24) ) >>> 0; } function toBytes(value: unknown): Uint8Array { if (value instanceof Uint8Array) return value; if (value instanceof ArrayBuffer) return new Uint8Array(value); if (ArrayBuffer.isView(value)) { return new Uint8Array(value.buffer, value.byteOffset, value.byteLength); } if (Array.isArray(value)) { const out = new Uint8Array(value.length); for (let i = 0; i < value.length; i++) out[i] = Number(value[i] ?? 0) & 255; return out; } if (typeof value !== "object" || value === null) { throw new Error("packet is not a byte buffer"); } const candidate = value as { length?: unknown; byteLength?: unknown; data?: unknown; [index: number]: unknown; [key: string]: unknown; }; if (Array.isArray(candidate.data)) { const out = new Uint8Array(candidate.data.length); for (let i = 0; i < candidate.data.length; i++) { out[i] = Number(candidate.data[i] ?? 0) & 255; } return out; } const lengthValue = Number(candidate.length); const byteLengthValue = Number(candidate.byteLength); let size = Number.isFinite(lengthValue) ? Math.max(0, Math.floor(lengthValue)) : Number.isFinite(byteLengthValue) ? Math.max(0, Math.floor(byteLengthValue)) : -1; if (size < 0) { let maxIndex = -1; for (const key of Object.keys(candidate)) { if (/^\d+$/.test(key)) maxIndex = Math.max(maxIndex, Number(key)); } if (maxIndex >= 0) size = maxIndex + 1; } if (size < 0) throw new Error("packet has no length"); const out = new Uint8Array(size); for (let i = 0; i < size; i++) { out[i] = Number(candidate[i] ?? 0) & 255; } return out; } // Compact binary payloads are faster than structured objects for hot loops. // Header: u16 passwordLen, u16 saltLen, u32 iterations, u16 keyBytes. 
export function encodeHashPacket( passwordBytes: Uint8Array, saltBytes: Uint8Array, iterations: number, keyBytes: number, ): Uint8Array { const headerSize = 10; const out = new Uint8Array( headerSize + passwordBytes.length + saltBytes.length, ); writeU16LE(out, 0, passwordBytes.length); writeU16LE(out, 2, saltBytes.length); writeU32LE(out, 4, iterations); writeU16LE(out, 8, keyBytes); out.set(passwordBytes, headerSize); out.set(saltBytes, headerSize + passwordBytes.length); return out; } function decodeHashPacket(packetLike: unknown): { password: Uint8Array; salt: Uint8Array; iterations: number; keyBytes: number; } { const packet = toBytes(packetLike); if (packet.length < 10) throw new Error("packet too small"); const passwordLen = readU16LE(packet, 0); const saltLen = readU16LE(packet, 2); const iterations = readU32LE(packet, 4); const keyBytes = readU16LE(packet, 8); const expected = 10 + passwordLen + saltLen; if (expected !== packet.length) throw new Error("packet size mismatch"); if (passwordLen < 8) throw new Error("password too short"); if (saltLen < MIN_SALT_BYTES || saltLen > MAX_SALT_BYTES) { throw new Error("salt size invalid"); } if (iterations < MIN_ITERATIONS || iterations > MAX_ITERATIONS) { throw new Error("iterations invalid"); } if (keyBytes < MIN_KEY_BYTES || keyBytes > MAX_KEY_BYTES) { throw new Error("key size invalid"); } const password = packet.slice(10, 10 + passwordLen); const salt = packet.slice(10 + passwordLen, expected); return { password, salt, iterations, keyBytes }; } // Result packet: u16 saltLen, u16 hashLen, u32 iterations, then salt + hash. 
function encodeHashResultPacket(
  salt: Uint8Array,
  hash: Uint8Array,
  iterations: number,
): Uint8Array {
  const out = new Uint8Array(8 + salt.length + hash.length);
  writeU16LE(out, 0, salt.length);
  writeU16LE(out, 2, hash.length);
  writeU32LE(out, 4, iterations);
  out.set(salt, 8);
  out.set(hash, 8 + salt.length);
  return out;
}

export function decodeHashResultPacket(packet: Uint8Array): {
  iterations: number;
  saltBase64: string;
  hashBase64: string;
} {
  if (packet.length < 8) throw new Error("result packet too small");
  const saltLen = readU16LE(packet, 0);
  const hashLen = readU16LE(packet, 2);
  const iterations = readU32LE(packet, 4);
  const expected = 8 + saltLen + hashLen;
  if (expected !== packet.length) {
    throw new Error("result packet size mismatch");
  }
  const salt = packet.slice(8, 8 + saltLen);
  const hash = packet.slice(8 + saltLen, expected);
  return {
    iterations,
    saltBase64: bytesToBase64(salt),
    hashBase64: bytesToBase64(hash),
  };
}

export async function hashPasswordPacketHost(
  packet: Uint8Array,
): Promise<Uint8Array> {
  const decoded = decodeHashPacket(packet);
  const hash = await derivePbkdf2(
    decoded.password,
    decoded.salt,
    decoded.iterations,
    decoded.keyBytes,
  );
  return encodeHashResultPacket(decoded.salt, hash, decoded.iterations);
}

export const hashPasswordPacket = task({
  f: hashPasswordPacketHost,
});

export async function hashPasswordPacketBatchFastHost(
  packets: Uint8Array[],
): Promise<HashBatchSummary> {
  let outputBytes = 0;
  let digestXor = 0;
  for (let i = 0; i < packets.length; i++) {
    const hashed = await hashPasswordPacketHost(packets[i]!);
    outputBytes += hashed.length;
    digestXor ^= hashed[hashed.length - 1] ??
0; } return { count: packets.length, outputBytes, digestXor }; } export const hashPasswordPacketBatchFast = task( { f: hashPasswordPacketBatchFastHost, }, ); function fillDeterministicSalt(seed: number, bytes: number): Uint8Array { let x = (seed ^ 0x9e3779b9) >>> 0; const out = new Uint8Array(bytes); for (let i = 0; i < bytes; i++) { x ^= x << 13; x ^= x >>> 17; x ^= x << 5; out[i] = x & 255; } return out; } export function makePasswordBytes(i: number): Uint8Array { return encoder.encode(`password-${i.toString(36)}-knitting`); } export function makeHashPacketForIndex( i: number, iterations: number, keyBytes: number, saltBytes: number, ): Uint8Array { const password = makePasswordBytes(i); const salt = fillDeterministicSalt(i + 1, saltBytes); return encodeHashPacket(password, salt, iterations, keyBytes); } export function buildDemoHashPackets(options: DemoPacketOptions): Uint8Array[] { const count = clampInt(options.count, 1, 1, 2_000_000); const iterations = clampInt( options.iterations, DEFAULT_ITERATIONS, MIN_ITERATIONS, MAX_ITERATIONS, ); const keyBytes = clampInt( options.keyBytes, DEFAULT_KEY_BYTES, MIN_KEY_BYTES, MAX_KEY_BYTES, ); const saltBytes = clampInt( options.saltBytes, DEFAULT_SALT_BYTES, MIN_SALT_BYTES, MAX_SALT_BYTES, ); const packets = new Array(count); for (let i = 0; i < count; i++) { packets[i] = makeHashPacketForIndex(i, iterations, keyBytes, saltBytes); } return packets; } export function hashSummaryFromOutputs( outputs: Uint8Array[], ): HashBatchSummary { let outputBytes = 0; let digestXor = 0; for (let i = 0; i < outputs.length; i++) { const out = outputs[i]!; outputBytes += out.length; digestXor ^= out[out.length - 1] ?? 0; } return { count: outputs.length, outputBytes, digestXor }; } export function utf8(bytes: Uint8Array): string { return decoder.decode(bytes); } ``` ## The ideal offload candidate Password hashing is one of those tasks where workers are almost always worth it. 
The work is CPU-bound, each call is independent, the input and output are small, and -- critically -- you *want* it to be slow (high iterations = harder to brute force). That's a perfect storm for offloading: expensive per-call work that would otherwise block your event loop during login/signup spikes.

---

# Hono server routes

URL: https://knittingdocs.netlify.app/examples/data_transforms/rendering_output/hono_server/

Build a Hono server with ping, SSR, and JWT routes in Knitting or host-only mode.

## What is this about

This example builds a small Hono API with 3 routes:

1. `GET /ping` for health checks.
2. `POST /ssr` to server-side render a user card.
3. `POST /jwt` to issue a signed JWT for a user.

This example is split into three files:

1. `hono_knitting.ts` (the Hono server, plus a Knitting worker pool).
2. `hono_componets_ssr.tsx` (SSR task: parse + defaults + render).
3. `hono_components_jwt.ts` (JWT task: validate + sign + return JSON string).

## Technologies used (and why)

- **Hono**: a small, fast routing layer. It keeps the request path minimal so most overhead is in your actual route work.
- **`@hono/node-server`**: a thin adapter that runs a Hono `fetch` handler on Node/Bun.
- **React SSR (`react-dom/server`)**: renders a tiny HTML page for `/ssr` so you can simulate CPU-heavy server work.
- **`hono/jwt`**: signs a JWT (HS256) for `/jwt` so the example includes common auth-like CPU work.
- **Knitting (`knitting`)**: runs selected transforms in a worker pool (threads). This is the core “offload expensive work” technique the example demonstrates.

The JWT route uses `hono/jwt` and signs with HS256. Set a real signing secret in production; note that the sample task reads `process.env.secret` (not `JWT_SECRET`) and falls back to the placeholder `"hello"`.

## Deno setup (TSX + npm)

This example imports TSX and npm packages.
For Deno, keep a root `deno.json` like this:

```json
{
  "nodeModulesDir": "auto",
  "compilerOptions": {
    "jsx": "react-jsx",
    "jsxImportSource": "react"
  }
}
```

Without this, TSX files can fail with:

```txt
Uncaught SyntaxError: Unexpected token '<'
```

## Run

## Route quick checks

```bash
# Ping
curl -s http://localhost:3000/ping

# SSR (returns HTML)
curl -s http://localhost:3000/ssr \
  -H 'content-type: application/json' \
  -d '{"name":"Ari","plan":"pro","bio":"Building on Knitting","projects":17}'

# JWT (returns JSON string with token)
curl -s http://localhost:3000/jwt \
  -H 'content-type: application/json' \
  -d '{"user":{"id":"u_42","email":"ari@example.com","role":"admin"},"ttlSec":900}'
```

## Performance notes (how to talk about it correctly)

### What the numbers say (hono_only → hono+knitting)

#### Throughput (RPS)

| Route | Hono only | Hono + Knitting | Delta |
| --- | ---: | ---: | ---: |
| `/ping` | `2998` | `5471` | **+82%** |
| `/ssr` | `2998` | `5421` | **+81%** |
| `/jwt` | `2916` | `5454` | **+87%** |

So: roughly **1.8× throughput** across the board. Offloading the heavy work frees the main thread, so even the “cheap” routes handle more requests per second when Knitting is in play.
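
The Delta column is a plain relative change between the two RPS figures. A minimal sketch of that arithmetic, using the numbers from the throughput table; the helper name `relativeChangePercent` is made up for illustration and is not part of Knitting or the benchmark scripts:

```ts
// Hypothetical helper: percent change from a baseline to a candidate value.
function relativeChangePercent(baseline: number, candidate: number): number {
  if (!(baseline > 0)) throw new RangeError("baseline must be positive");
  return (candidate / baseline - 1) * 100;
}

// RPS figures copied from the throughput table.
const throughput = [
  { route: "/ping", honoOnly: 2998, withKnitting: 5471 },
  { route: "/ssr", honoOnly: 2998, withKnitting: 5421 },
  { route: "/jwt", honoOnly: 2916, withKnitting: 5454 },
];

const deltas = throughput.map((row) => ({
  route: row.route,
  percent: Math.round(relativeChangePercent(row.honoOnly, row.withKnitting)),
}));

for (const { route, percent } of deltas) {
  console.log(`${route}: +${percent}%`);
}
// prints "/ping: +82%", "/ssr: +81%", "/jwt: +87%"
```

The same formula gives negative values when the candidate is lower than the baseline, which is how a latency improvement shows up: `relativeChangePercent(16.32, 8.84)` is roughly `-46`.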
### Latency improvements (lower is better)

#### PING

| Percentile | Hono only | Hono + Knitting | Delta |
| --- | ---: | ---: | ---: |
| `p50` | `16.32ms` | `8.84ms` | **-46%** |
| `p99` | `27.05ms` | `14.52ms` | **-46%** |
| `p99.9` | `71.01ms` | `64.59ms` | **-9%** |
| `p99.99` | `128.70ms` | `99.44ms` | **-23%** |
| `slowest` | `129.25ms` | `99.90ms` | **-23%** |

#### SSR

| Percentile | Hono only | Hono + Knitting | Delta |
| --- | ---: | ---: | ---: |
| `p50` | `16.32ms` | `8.71ms` | **-47%** |
| `p99` | `27.32ms` | `17.35ms` | **-36%** |
| `p99.9` | `71.96ms` | `63.97ms` | **-11%** |
| `p99.99` | `128.61ms` | `98.75ms` | **-23%** |
| `slowest` | `129.31ms` | `106.73ms` | **-17%** |

#### JWT

| Percentile | Hono only | Hono + Knitting | Delta |
| --- | ---: | ---: | ---: |
| `p50` | `16.47ms` | `8.71ms` | **-47%** |
| `p99` | `34.49ms` | `15.91ms` | **-54%** |
| `p99.9` | `114.95ms` | `59.48ms` | **-48%** |
| `p99.99` | `121.47ms` | `101.07ms` | **-17%** |
| `slowest` | `161.28ms` | `138.48ms` | **-14%** |

Headline: the median roughly halves on all three routes, `p99` drops by 36-54%, and the JWT tail (`p99.9`) falls by 48%.

## Why this pattern matters

- `ping` stays cheap and synchronous.
- You can benchmark workers vs host-only with the same route behavior.
- In Knitting mode, the JWT task returns **stringified JSON** to reduce structured-clone overhead.
- Heavy route logic stays in one shared file, keeping both server entrypoints small.

## Code

```ts
import { serve } from "@hono/node-server";
import { createPool } from "knitting";
import { Hono } from "hono";
import { issueJwt } from "./hono_components_jwt.ts";
import { renderSsrPage } from "./hono_componets_ssr.tsx";

const handlers = createPool({})({
  issueJwt,
  renderSsrPage,
});

async function main() {
  const app = new Hono();

  app.get("/ping", (c) => {
    return c.json({
      ok: true,
      pong: true,
      runtime: process.release?.name ??
"unknown", ts: new Date().toISOString(), }); }); app.post("/ssr", async (c) => { const html = await handlers.call.renderSsrPage(c.req.arrayBuffer()); return c.html(html); }); app.post("/jwt", async (c) => { const responseJson = await handlers.call.issueJwt(c.req.arrayBuffer()); return c.body(responseJson ?? "Bad request", responseJson ? 200 : 400, { "content-type": "application/json; charset=utf-8", }); }); const server = serve({ fetch: app.fetch, port: 3000 }, (info) => { console.log("GET /ping"); console.log("POST /ssr body: { name?, plan?, bio?, projects? }"); console.log("POST /jwt body: { user: { id, email?, role? }, ttlSec? }"); }); const close = () => { // IMPORTANT TO CLOSE CONNECTION handlers.shutdown(); server.close(); }; process.on("SIGINT", close); process.on("SIGTERM", close); } main().catch((error) => { console.error(error); process.exitCode = 1; }); ``` ```tsx import React from "react"; import { renderToString } from "react-dom/server"; import { task } from "knitting"; import { z } from "zod"; const utf8Decoder = new TextDecoder("utf-8", { fatal: true }); type SsrInput = { name: string; plan: "free" | "pro"; bio: string; projects: number; }; function UserCard({ user }: { user: SsrInput & { updatedAt: string } }) { return ( {`${user.name} - SSR Card`}

{user.name}

{user.bio}

{user.plan.toUpperCase()} plan {user.projects.toLocaleString()} projects Rendered at {user.updatedAt}

  );
}

const ParsedJsonObjectSchema = z.string().transform((raw, ctx) => {
  try {
    const parsed = JSON.parse(raw) as unknown;
    if (
      typeof parsed !== "object" || parsed === null || Array.isArray(parsed)
    ) {
      ctx.addIssue({
        code: z.ZodIssueCode.custom,
        message: "payload: expected JSON object",
      });
      return z.NEVER;
    }
    return parsed as Record<string, unknown>;
  } catch {
    ctx.addIssue({
      code: z.ZodIssueCode.custom,
      message: "payload: expected JSON object",
    });
    return z.NEVER;
  }
});

// Each field is normalized first; invalid values become undefined so the
// defaults in SsrInputSchema apply.
const RawSsrInputSchema = z.object({
  name: z.preprocess((value) => {
    if (typeof value !== "string") return undefined;
    const normalized = value.trim();
    return normalized.length > 0 ? normalized : undefined;
  }, z.string().optional()),
  plan: z.preprocess(
    (value) => (value === "free" || value === "pro" ? value : undefined),
    z.enum(["free", "pro"]).optional(),
  ),
  bio: z.preprocess((value) => {
    if (typeof value !== "string") return undefined;
    const normalized = value.trim();
    return normalized.length > 0 ? normalized : undefined;
  }, z.string().optional()),
  projects: z.preprocess((value) => {
    const numberValue = Number(value);
    if (!Number.isFinite(numberValue)) return undefined;
    return Math.max(0, Math.min(100_000, Math.floor(numberValue)));
  }, z.number().int().optional()),
});

const SsrInputSchema = RawSsrInputSchema.transform(
  (value): SsrInput => ({
    name: value.name ?? "Anonymous",
    plan: value.plan ?? "free",
    bio: value.bio ?? "No bio yet.",
    projects: value.projects ?? 0,
  }),
);

export function renderSsrPageHost(rawPayload: ArrayBuffer): string {
  let decodedPayload = "";
  try {
    decodedPayload = utf8Decoder.decode(rawPayload);
  } catch {
    decodedPayload = "";
  }
  const parsed = ParsedJsonObjectSchema.safeParse(decodedPayload);
  const user: SsrInput = SsrInputSchema.parse(
    parsed.success ?
parsed.data : {}, ); const html = renderToString( , ); return `${html}`; } export const renderSsrPage = task({ f: renderSsrPageHost, }); ``` ```ts import { sign } from "hono/jwt"; import { task } from "knitting"; import { z } from "zod"; const utf8Decoder = new TextDecoder("utf-8", { fatal: true }); const ParsedJsonObjectSchema = z.string().transform((raw, ctx) => { try { const parsed = JSON.parse(raw) as unknown; if ( typeof parsed !== "object" || parsed === null || Array.isArray(parsed) ) { ctx.addIssue({ code: z.ZodIssueCode.custom, message: "payload: expected JSON object", }); return z.NEVER; } return parsed; } catch { ctx.addIssue({ code: z.ZodIssueCode.custom, message: "payload: expected JSON object", }); return z.NEVER; } }); const JwtUserSchema = z.object({ id: z.string().min(1), email: z.string().email().optional(), role: z.string().min(1).optional(), }); const TtlSecSchema = z.preprocess((value) => { const n = Number(value); if (!Number.isFinite(n)) return 900; return Math.max(30, Math.min(86_400, Math.floor(n))); }, z.number().int()); const JwtPayloadSchema = z.object({ user: JwtUserSchema, ttlSec: TtlSecSchema.optional().default(900), }); export async function issueJwtHost( rawPayload: ArrayBuffer, ): Promise { let decodedPayload: string; try { decodedPayload = utf8Decoder.decode(rawPayload); } catch { return null; } const parsedResult = ParsedJsonObjectSchema.safeParse(decodedPayload); if (!parsedResult.success) { return null; } const payloadResult = JwtPayloadSchema.safeParse(parsedResult.data); if (!payloadResult.success) { return null; } const { user, ttlSec } = payloadResult.data; const now = Math.floor(Date.now() / 1000); const exp = now + ttlSec; const token = await sign( { sub: user.id, email: user.email, role: user.role ?? "member", iat: now, exp, }, process.env.secret ?? 
"hello", ); return JSON.stringify({ ok: true, token, sub: user.id, exp, }); } export const issueJwt = task({ f: issueJwtHost, }); ``` --- # Prompt token budgeting URL: https://knittingdocs.netlify.app/examples/data_transforms/validation/prompt_token_budgeting/ Trim LLM prompts to fit a token budget before sending them to any model. Shapes LLM prompts to fit within a token budget before they hit the API. Uses `tiktoken` for token counting and drops oldest conversation turns first, then trims the query if still over budget. The budgeting logic itself is model-agnostic -- the example uses `gpt-4o-mini` as the tokenizer target, but the pattern works the same way for any model with a token limit. > Note: This example imports `tiktoken` and `openai`, but it never actually calls the OpenAI API. The `openai` package is optional -- these scripts only prepare prompts and count tokens. The pattern applies to any LLM provider. ## How it works 1. The host generates prompt inputs (same system prefix, different conversation history and queries). 2. Each task builds the full prompt and counts tokens with `tiktoken`. 3. If over budget, it drops the oldest turns first, then trims query tokens. 4. The host aggregates token savings and trim counts. 
Three files:

- `run_prompt_token_budget.ts` -- practical prompt-budgeting example for app logic
- `bench_prompt_token_budget.ts` -- dedicated `mitata` benchmark measuring budgeting throughput
- `token_budget.ts` -- the budgeting logic itself

## Example budget decision

Input:

```ts
const input = {
  model: "gpt-4o-mini",
  systemPrefix: "You are a docs assistant.",
  history: [
    "Need guidance on schema validation.",
    "Compare workers and batching.",
    "Keep the answer short.",
  ],
  query: "Give a migration plan and one code example.",
  maxInputTokens: 900,
};
```

Output shape:

```ts
{
  prompt: "...final prompt string...",
  rawInputTokens: 1120,
  inputTokens: 884,
  trimmedTurns: 1,
  queryWasTrimmed: false,
}
```

The useful part is not just the final `prompt`; you also get the bookkeeping needed to explain why a prompt was trimmed and by how much.

`openai` is optional. These examples only prepare prompts and token budgets. `mitata` is only needed for the benchmark script.

## Run

Expected output (illustrative; exact figures depend on flags and hardware):

```
mode: knitting (2 threads)
requests: 2000
budget: 900 tokens (gpt-4o-mini)
trimmed: 1,247 / 2,000 requests (62.4%)
avg tokens saved: 312 per trimmed request
total tokens saved: 389,064
elapsed: 1.82s
```

## Optional benchmark

Compares budgeting throughput on the host vs through workers. Worker tasks return compact totals (token/trim counters), not full prompt strings.

> Note: Batch size matters here. Tiny batches mostly measure dispatch overhead; very large batches can increase memory pressure. Start with `32` or `64`, then tune for your hardware.
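
To make the batch-size trade-off concrete, here is a small sketch of how batch size changes the number of dispatches for a fixed request count. The generic `makeBatches` helper is an assumption modeled on the chunking helper in the password-hashing benchmark, not code taken from `bench_prompt_token_budget.ts`, and the request count of 2,000 is only an example:

```ts
// Split items into consecutive batches of at most batchSize elements.
function makeBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    out.push(items.slice(i, i + batchSize));
  }
  return out;
}

// Stand-in for 2,000 prompt inputs; only the count matters here.
const requests = Array.from({ length: 2_000 }, (_, i) => i);

const dispatchCounts = [1, 32, 64].map((size) => ({
  size,
  dispatches: makeBatches(requests, size).length,
}));

for (const { size, dispatches } of dispatchCounts) {
  console.log(`batch ${size}: ${dispatches} dispatches`);
}
// prints "batch 1: 2000 dispatches", "batch 32: 63 dispatches",
// "batch 64: 32 dispatches"
```

Each batch would become a single worker call, so moving from a batch size of 1 to 32 cuts the number of round trips from 2,000 to 63, while every individual call carries a larger payload.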
## Code

```ts
import { createPool, isMain } from "knitting";
import {
  preparePrompt,
  preparePromptHost,
  type PromptInput,
  type PromptPlan,
} from "./token_budget.ts";

// Read `--name <number>` from argv, falling back when absent or invalid.
function intArg(name: string, fallback: number): number {
  const i = process.argv.indexOf(`--${name}`);
  if (i !== -1 && i + 1 < process.argv.length) {
    const value = Number(process.argv[i + 1]);
    if (Number.isFinite(value)) return Math.floor(value);
  }
  return fallback;
}

// Read `--name <string>` from argv, falling back when absent.
function strArg(name: string, fallback: string): string {
  const i = process.argv.indexOf(`--${name}`);
  if (i !== -1 && i + 1 < process.argv.length) {
    return String(process.argv[i + 1]);
  }
  return fallback;
}

const THREADS = Math.max(1, intArg("threads", 2));
const REQUESTS = Math.max(1, intArg("requests", 20_000));
const MAX_INPUT_TOKENS = Math.max(64, intArg("maxInputTokens", 900));
const MODE = strArg("mode", "knitting");
const MODEL = strArg("model", "gpt-4o-mini");

const SYSTEM_PREFIX = [
  "You are a docs assistant.",
  "Prefer concrete and short answers.",
  "If data is missing, say it directly.",
  "Do not invent unsupported behavior.",
].join("\n");

const TOPICS = [
  "token budgeting",
  "prompt caching",
  "parallel workers",
  "schema validation",
  "rendering pipelines",
  "markdown output",
  "compression tradeoffs",
  "latency under load",
];

function pick<T>(arr: T[], i: number): T {
  return arr[i % arr.length]!;
}

function makeHistory(i: number): string[] {
  const turns = 3 + (i % 10);
  const history = new Array<string>(turns);
  for (let t = 0; t < turns; t++) {
    const topic = pick(TOPICS, i + t);
    history[t] = `Need guidance on ${topic}.
Include practical steps and one small code example.`;
  }
  return history;
}

function makeInput(i: number): PromptInput {
  const topicA = pick(TOPICS, i);
  const topicB = pick(TOPICS, i + 3);
  const query = [
    `Please compare ${topicA} with ${topicB}.`,
    "I care about cost per request and response quality.",
    "Give a short recommendation and a migration path.",
  ].join(" ");
  return {
    model: MODEL,
    systemPrefix: SYSTEM_PREFIX,
    history: makeHistory(i),
    query,
    maxInputTokens: MAX_INPUT_TOKENS,
  };
}

type Totals = {
  rawTokens: number;
  budgetedTokens: number;
  staticTokens: number;
  dynamicTokens: number;
  trimmedRuns: number;
  queryTrimmedRuns: number;
  turnsDropped: number;
};

function summarize(plans: PromptPlan[]): Totals {
  const totals: Totals = {
    rawTokens: 0,
    budgetedTokens: 0,
    staticTokens: 0,
    dynamicTokens: 0,
    trimmedRuns: 0,
    queryTrimmedRuns: 0,
    turnsDropped: 0,
  };
  for (const plan of plans) {
    totals.rawTokens += plan.rawInputTokens;
    totals.budgetedTokens += plan.inputTokens;
    totals.staticTokens += plan.staticTokens;
    totals.dynamicTokens += plan.dynamicTokens;
    totals.turnsDropped += plan.trimmedTurns;
    if (plan.trimmedTurns > 0) totals.trimmedRuns++;
    if (plan.queryWasTrimmed) totals.queryTrimmedRuns++;
  }
  return totals;
}

function runHost(inputs: PromptInput[]): Totals {
  const plans = inputs.map((input) => preparePromptHost(input));
  return summarize(plans);
}

async function runWorkers(inputs: PromptInput[]): Promise<Totals> {
  const pool = createPool({ threads: THREADS })({ preparePrompt });
  try {
    // Dispatch every request first, then await them together.
    const jobs: Promise<PromptPlan>[] = [];
    for (let i = 0; i < inputs.length; i++) {
      jobs.push(pool.call.preparePrompt(inputs[i]!));
    }
    const plans = await Promise.all(jobs);
    return summarize(plans);
  } finally {
    pool.shutdown();
  }
}

function percent(saved: number, base: number): string {
  if (base <= 0) return "0.0%";
  return `${((saved / base) * 100).toFixed(1)}%`;
}

async function main() {
  const inputs = new Array<PromptInput>(REQUESTS);
  for (let i = 0; i < REQUESTS; i++) inputs[i] = makeInput(i);

  const started =
performance.now(); const totals = MODE === "host" ? runHost(inputs) : await runWorkers(inputs); const finished = performance.now(); const tookMs = finished - started; const secs = Math.max(1e-9, tookMs / 1000); const reqPerSec = REQUESTS / secs; const savedTokens = Math.max(0, totals.rawTokens - totals.budgetedTokens); const cacheableTokensEstimate = totals.staticTokens; console.log("Prompt token budgeting"); console.log("mode :", MODE); console.log("model :", MODEL); console.log("threads :", MODE === "host" ? 0 : THREADS); console.log("requests :", REQUESTS.toLocaleString()); console.log("maxInputTokens :", MAX_INPUT_TOKENS.toLocaleString()); console.log("raw tokens :", totals.rawTokens.toLocaleString()); console.log("budgeted tokens :", totals.budgetedTokens.toLocaleString()); console.log( "saved tokens :", `${savedTokens.toLocaleString()} (${ percent(savedTokens, totals.rawTokens) })`, ); console.log("trimmed runs :", totals.trimmedRuns.toLocaleString()); console.log("query trimmed runs:", totals.queryTrimmedRuns.toLocaleString()); console.log("turns dropped :", totals.turnsDropped.toLocaleString()); console.log("cacheable estimate:", cacheableTokensEstimate.toLocaleString()); console.log("took :", tookMs.toFixed(2), "ms"); console.log("throughput :", reqPerSec.toFixed(0), "req/s"); } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { createPool, isMain } from "knitting"; import { bench, boxplot, run, summary } from "mitata"; import { preparePromptBatchFast, preparePromptBatchFastHost, type PromptBudgetSummary, type PromptInput, } from "./token_budget.ts"; function intArg(name: string, fallback: number): number { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const value = Number(process.argv[i + 1]); if (Number.isFinite(value)) return Math.floor(value); } return fallback; } function strArg(name: string, fallback: string): string { const i = 
process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { return String(process.argv[i + 1]); } return fallback; } const THREADS = Math.max(1, intArg("threads", 2)); const REQUESTS = Math.max(1, intArg("requests", 10)); const MAX_INPUT_TOKENS = Math.max(64, intArg("maxInputTokens", 500)); const BATCH = Math.max(1, intArg("batch", 32)); const MODEL = strArg("model", "gpt-4o-mini"); const SYSTEM_PREFIX = [ "You are a docs assistant.", "Prefer concrete and short answers.", "If data is missing, say it directly.", "Do not invent unsupported behavior.", ].join("\n"); const TOPICS = [ "token budgeting", "prompt caching", "parallel workers", "schema validation", "rendering pipelines", "markdown output", "compression tradeoffs", "latency under load", ]; function pick<T>(arr: T[], i: number): T { return arr[i % arr.length]!; } function makeHistory(i: number): string[] { const turns = 3 + (i % 10); const history = new Array<string>(turns); for (let t = 0; t < turns; t++) { const topic = pick(TOPICS, i + t); history[t] = `Need guidance on ${topic}. 
Include practical steps and one small code example.`; } return history; } function makeInput(i: number): PromptInput { const topicA = pick(TOPICS, i); const topicB = pick(TOPICS, i + 3); const query = [ `Please compare ${topicA} with ${topicB}.`, "I care about cost per request and response quality.", "Give a short recommendation and a migration path.", ].join(" "); return { model: MODEL, systemPrefix: SYSTEM_PREFIX, history: makeHistory(i), query, maxInputTokens: MAX_INPUT_TOKENS, }; } function makeBatches<T>(values: T[], batchSize: number): T[][] { const batches: T[][] = []; for (let i = 0; i < values.length; i += batchSize) { batches.push(values.slice(i, i + batchSize)); } return batches; } function mergeSummary( a: PromptBudgetSummary, b: PromptBudgetSummary, ): PromptBudgetSummary { return { rawTokens: a.rawTokens + b.rawTokens, budgetedTokens: a.budgetedTokens + b.budgetedTokens, staticTokens: a.staticTokens + b.staticTokens, dynamicTokens: a.dynamicTokens + b.dynamicTokens, trimmedRuns: a.trimmedRuns + b.trimmedRuns, queryTrimmedRuns: a.queryTrimmedRuns + b.queryTrimmedRuns, turnsDropped: a.turnsDropped + b.turnsDropped, }; } function runHostBatches(inputBatches: PromptInput[][]): PromptBudgetSummary { let totals: PromptBudgetSummary = { rawTokens: 0, budgetedTokens: 0, staticTokens: 0, dynamicTokens: 0, trimmedRuns: 0, queryTrimmedRuns: 0, turnsDropped: 0, }; for (let i = 0; i < inputBatches.length; i++) { totals = mergeSummary(totals, preparePromptBatchFastHost(inputBatches[i]!)); } return totals; } async function runWorkerBatches( callBatch: (inputs: PromptInput[]) => Promise<PromptBudgetSummary>, inputBatches: PromptInput[][], ): Promise<PromptBudgetSummary> { const jobs: Promise<PromptBudgetSummary>[] = []; for (let i = 0; i < inputBatches.length; i++) { jobs.push(callBatch(inputBatches[i]!)); } const results = await Promise.all(jobs); let totals: PromptBudgetSummary = { rawTokens: 0, budgetedTokens: 0, staticTokens: 0, dynamicTokens: 0, trimmedRuns: 0, queryTrimmedRuns: 0, turnsDropped: 0, }; for (let i = 0; i < 
results.length; i++) { totals = mergeSummary(totals, results[i]!); } return totals; } function sameSummary(a: PromptBudgetSummary, b: PromptBudgetSummary): boolean { return a.rawTokens === b.rawTokens && a.budgetedTokens === b.budgetedTokens && a.staticTokens === b.staticTokens && a.dynamicTokens === b.dynamicTokens && a.trimmedRuns === b.trimmedRuns && a.queryTrimmedRuns === b.queryTrimmedRuns && a.turnsDropped === b.turnsDropped; } async function main() { const inputs = new Array<PromptInput>(REQUESTS); for (let i = 0; i < REQUESTS; i++) { inputs[i] = makeInput(i); } const inputBatches = makeBatches(inputs, BATCH); const pool = createPool({ threads: THREADS - 1, inliner: { batchSize: 8, }, })({ preparePromptBatchFast }); let sink = 0; try { const hostCheck = runHostBatches(inputBatches); const workerCheck = await runWorkerBatches( pool.call.preparePromptBatchFast, inputBatches, ); if (!sameSummary(hostCheck, workerCheck)) { throw new Error("Host and worker prompt-budget totals differ."); } console.log("Prompt token budgeting benchmark (mitata)"); console.log("workload: build prompt + tokenize + trim to budget"); console.log("model:", MODEL); console.log("requests per iteration:", REQUESTS.toLocaleString()); console.log("max input tokens:", MAX_INPUT_TOKENS.toLocaleString()); console.log("batch size:", BATCH); console.log("threads:", THREADS); boxplot(() => { summary(() => { bench(`host (${REQUESTS.toLocaleString()} req, batch ${BATCH})`, () => { const totals = runHostBatches(inputBatches); sink = totals.budgetedTokens; }); bench( `knitting (${THREADS} thread${ THREADS === 1 ? 
"" : "s" }, ${REQUESTS.toLocaleString()} req, batch ${BATCH})`, async () => { const totals = await runWorkerBatches( pool.call.preparePromptBatchFast, inputBatches, ); sink = totals.budgetedTokens; }, ); }); }); await run(); console.log("last budgeted tokens:", sink.toLocaleString()); } finally { pool.shutdown(); } } if (isMain) { main().catch((error) => { console.error(error); process.exitCode = 1; }); } ``` ```ts import { task } from "knitting"; import { encoding_for_model } from "tiktoken"; export type PromptInput = { model: string; systemPrefix: string; history: string[]; query: string; maxInputTokens: number; }; export type PromptPlan = { prompt: string; rawInputTokens: number; inputTokens: number; staticTokens: number; dynamicTokens: number; trimmedTurns: number; queryWasTrimmed: boolean; }; export type PromptPlanFast = Omit<PromptPlan, "prompt">; export type PromptBudgetSummary = { rawTokens: number; budgetedTokens: number; staticTokens: number; dynamicTokens: number; trimmedRuns: number; queryTrimmedRuns: number; turnsDropped: number; }; const decoder = new TextDecoder(); type Encoder = ReturnType<typeof encoding_for_model>; const MAX_ENCODER_CACHE = 4; const MAX_STATIC_TOKEN_CACHE = 512; const encoderCache = new Map<string, Encoder>(); const staticTokenCache = new Map<string, number>(); function normalizeText(value: string): string { return value.replace(/\s+/g, " ").trim(); } function countTokens(enc: Encoder, text: string): number { return enc.encode(text).length; } function touchMapEntry<V>(map: Map<string, V>, key: string, value: V): void { map.delete(key); map.set(key, value); } function evictOldestEncoderIfNeeded(): void { if (encoderCache.size <= MAX_ENCODER_CACHE) return; const oldest = encoderCache.keys().next().value; if (oldest === undefined) return; const enc = encoderCache.get(oldest); if (enc) enc.free(); encoderCache.delete(oldest); } function evictOldestStaticTokenIfNeeded(): void { if (staticTokenCache.size <= MAX_STATIC_TOKEN_CACHE) return; const oldest = staticTokenCache.keys().next().value; if (oldest !== undefined) 
staticTokenCache.delete(oldest); } function getEncoder(model: string): Encoder { const cached = encoderCache.get(model); if (cached) { touchMapEntry(encoderCache, model, cached); return cached; } const enc = encoding_for_model(model as never); encoderCache.set(model, enc); evictOldestEncoderIfNeeded(); return enc; } function getStaticTokens( model: string, systemPrefix: string, enc: Encoder, ): number { const key = `${model}\x1f${systemPrefix}`; const cached = staticTokenCache.get(key); if (cached !== undefined) { touchMapEntry(staticTokenCache, key, cached); return cached; } const value = countTokens(enc, systemPrefix); staticTokenCache.set(key, value); evictOldestStaticTokenIfNeeded(); return value; } export function clearPromptBudgetCaches(): void { for (const enc of encoderCache.values()) enc.free(); encoderCache.clear(); staticTokenCache.clear(); } function truncateToTokenBudget( enc: Encoder, text: string, maxTokens: number, ): string { if (maxTokens <= 0) return ""; const tokens = enc.encode(text); if (tokens.length <= maxTokens) return text; const clipped = tokens.slice(0, maxTokens); return decoder.decode(enc.decode(clipped)); } function buildPrompt( systemPrefix: string, history: string[], query: string, ): string { const rows: string[] = []; rows.push(systemPrefix.trim()); rows.push(""); rows.push("Conversation context:"); for (let i = 0; i < history.length; i++) { rows.push(`- Turn ${i + 1}: ${history[i]}`); } rows.push(""); rows.push(`User request: ${query}`); return rows.join("\n"); } export function preparePromptHost(input: PromptInput): PromptPlan { const model = input.model; const maxInputTokens = Math.max(64, input.maxInputTokens); const cleanHistory = input.history.map(normalizeText).filter(Boolean); let history = [...cleanHistory]; let query = normalizeText(input.query); const enc = getEncoder(model); const staticTokens = getStaticTokens(model, input.systemPrefix, enc); let prompt = buildPrompt(input.systemPrefix, history, query); const 
rawInputTokens = countTokens(enc, prompt); let inputTokens = rawInputTokens; let trimmedTurns = 0; let queryWasTrimmed = false; while (inputTokens > maxInputTokens && history.length > 0) { history.shift(); trimmedTurns++; prompt = buildPrompt(input.systemPrefix, history, query); inputTokens = countTokens(enc, prompt); } if (inputTokens > maxInputTokens) { const promptWithoutQuery = buildPrompt(input.systemPrefix, history, ""); const promptWithoutQueryTokens = countTokens(enc, promptWithoutQuery); const remainingBudget = Math.max( 16, maxInputTokens - promptWithoutQueryTokens, ); const clipped = truncateToTokenBudget(enc, query, remainingBudget); queryWasTrimmed = clipped.length < query.length; query = clipped; prompt = buildPrompt(input.systemPrefix, history, query); inputTokens = countTokens(enc, prompt); } return { prompt, rawInputTokens, inputTokens, staticTokens, dynamicTokens: Math.max(0, inputTokens - staticTokens), trimmedTurns, queryWasTrimmed, }; } export const preparePrompt = task({ f: (input) => preparePromptHost(input), }); export function preparePromptFastHost(input: PromptInput): PromptPlanFast { const plan = preparePromptHost(input); return { rawInputTokens: plan.rawInputTokens, inputTokens: plan.inputTokens, staticTokens: plan.staticTokens, dynamicTokens: plan.dynamicTokens, trimmedTurns: plan.trimmedTurns, queryWasTrimmed: plan.queryWasTrimmed, }; } export function preparePromptBatchFastHost( inputs: PromptInput[], ): PromptBudgetSummary { let totals: PromptBudgetSummary = { rawTokens: 0, budgetedTokens: 0, staticTokens: 0, dynamicTokens: 0, trimmedRuns: 0, queryTrimmedRuns: 0, turnsDropped: 0, }; for (let i = 0; i < inputs.length; i++) { const plan = preparePromptFastHost(inputs[i]!); totals.rawTokens += plan.rawInputTokens; totals.budgetedTokens += plan.inputTokens; totals.staticTokens += plan.staticTokens; totals.dynamicTokens += plan.dynamicTokens; totals.turnsDropped += plan.trimmedTurns; if (plan.trimmedTurns > 0) totals.trimmedRuns++; if 
(plan.queryWasTrimmed) totals.queryTrimmedRuns++; } return totals; } export const preparePromptBatchFast = task({ f: (inputs) => preparePromptBatchFastHost(inputs), }); ``` ## When this matters Token budgeting is a preflight step that runs on every LLM request. If you're handling high-throughput chat traffic -- multiple users, long conversation histories -- the tokenization and trimming work adds up. Offloading it to workers keeps your main thread focused on routing and I/O while budget calculations happen in parallel. It also gives you predictable input sizes, which helps with cost control and latency. --- # TSP (GSA) URL: https://knittingdocs.netlify.app/examples/maths/tsp_gsa/ Parallel heuristic restarts for a gravity-inspired TSP solver The **Traveling Salesman Problem**: given N cities, find the shortest tour that visits each one exactly once and returns to the start. This is NP-hard, so we use heuristics -- specifically, a gravity-inspired population search (GSA) followed by 2-opt local refinement, run as many parallel restarts through Knitting. More restarts = better chance of finding a good solution. This is the most computationally intensive example in the set, and it shows what Knitting looks like on a real optimization workload. ## How it works 1. The host generates (or seeds) a set of cities and a distance matrix. 2. The host launches many **independent solver runs** ("restarts") through the worker pool. 3. Each worker runs a gravity-inspired search to produce a candidate tour, then applies 2-opt to sharpen it. 4. Each worker returns `{ bestLen, bestTour }`. 5. The host picks the global best, validates the tour, and recomputes the length for correctness. The key insight: a single heuristic run can get stuck in a local minimum. Running many independent trials in parallel increases the chance that at least one finds a better region of the search space. 
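
The restart pattern in the steps above can be sketched without a worker pool. This is a minimal illustration, not the example's code: `runRestart` is a hypothetical stand-in for one solver run (in the real example that role is played by `call.solveTspGsa([...])`), and its made-up tour lengths exist only so the snippet runs on its own.

```ts
type RunResult = { seed: number; bestLen: number };

// Hypothetical stand-in for one solver restart. The "tour length" is a cheap
// deterministic function of the seed, not a real TSP result.
async function runRestart(seed: number): Promise<RunResult> {
  const bestLen = 800 + ((seed * 37 + 53) % 101);
  return { seed, bestLen };
}

// Host-side reduction: keep the run with the shortest tour.
function pickBest(results: RunResult[]): RunResult {
  if (results.length === 0) throw new Error("no results");
  return results.reduce((best, res) => (res.bestLen < best.bestLen ? res : best));
}

// Launch every restart up front, wait for all of them, then pick the best.
async function bestOfRestarts(restarts: number): Promise<RunResult> {
  const jobs: Promise<RunResult>[] = [];
  for (let r = 0; r < restarts; r++) jobs.push(runRestart(r));
  return pickBest(await Promise.all(jobs));
}

bestOfRestarts(8).then((best) => console.log("best restart:", best));
```

Because each restart returns only a small record, the host-side reduction stays cheap however many restarts are queued.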
This is **embarrassingly parallel** -- restarts are independent, results are small, and the host just picks the best. ## Run Expected output: ``` cities: 64 restarts: 64 threads: 7 pop: 10 iters: 10 dispatching 64 restarts... best tour length: 847.32 tour valid: OK (64 cities, no duplicates, all present) recomputed length: 847.32 OK random baseline: 2,341.07 (solver is 2.76x better than random) elapsed: 2.14s ``` ## Code ```ts import { createPool, isMain } from "knitting"; import { solveTspGsa } from "./tsp_gsa.ts"; function intArg(name: string, fallback: number) { const i = process.argv.indexOf(`--${name}`); if (i !== -1 && i + 1 < process.argv.length) { const v = Number(process.argv[i + 1]); if (Number.isFinite(v) && v > 0) return Math.floor(v); } return fallback; } const THREADS = intArg("threads", 7); const RESTARTS = intArg("restarts", 64); const N = intArg("cities", 64); const POP = intArg("pop", 10); const ITERS = intArg("iters", 10); const worldSeed = intArg("worldSeed", 123456); const seedBase = (Date.now() | 0) ^ 0x9e3779b9; const { call, shutdown } = createPool({ threads: THREADS, balancer: "firstIdle", })({ solveTspGsa }); function validateTour(tour: number[], n: number) { if (tour.length !== n) throw new Error(`tour length ${tour.length} != ${n}`); const seen = new Uint8Array(n); for (const v of tour) { if ((v | 0) !== v) throw new Error(`non-int city id: ${v}`); if (v < 0 || v >= n) throw new Error(`bad city id: ${v}`); if (seen[v]) throw new Error(`duplicate city: ${v}`); seen[v] = 1; } } /* ---------- Host recompute must match worker exactly ---------- */ function xorshift32(s: number): number { s |= 0; s ^= s << 13; s ^= s >>> 17; s ^= s << 5; return s | 0; } const INV_U32 = 2.3283064365386963e-10; // 1 / 2^32 function makeCities(worldSeed: number, n: number): Float64Array { // coords: [x0,y0,x1,y1,...] 
in [0,1) const coords = new Float64Array(n * 2); let s = worldSeed | 0; for (let i = 0; i < n; i++) { s = xorshift32(s); coords[i * 2 + 0] = (s >>> 0) * INV_U32; s = xorshift32(s); coords[i * 2 + 1] = (s >>> 0) * INV_U32; } return coords; } function makeDistMatrix(coords: Float64Array, n: number): Float32Array { const d = new Float32Array(n * n); for (let i = 0; i < n; i++) { const xi = coords[i * 2 + 0]; const yi = coords[i * 2 + 1]; for (let j = i + 1; j < n; j++) { const dx = xi - coords[j * 2 + 0]; const dy = yi - coords[j * 2 + 1]; const dist = Math.hypot(dx, dy); d[i * n + j] = dist; d[j * n + i] = dist; } } return d; } function recomputeLen(tour: number[], dist: Float32Array, n: number): number { let s = 0; let prev = tour[0]; for (let i = 1; i < n; i++) { const cur = tour[i]; s += dist[prev * n + cur]; prev = cur; } s += dist[prev * n + tour[0]]; return s; } /* ---------- Random pick + random baseline tour ---------- */ function pickRandomIndex(len: number, seed: number): number { // deterministic “random” based on seed const s = xorshift32(seed | 0); return (s >>> 0) % len; } function makeRandomTour(n: number, seed: number): number[] { // Fisher–Yates shuffle const tour = new Array(n); for (let i = 0; i < n; i++) tour[i] = i; let s = seed | 0; for (let i = n - 1; i > 0; i--) { s = xorshift32(s); const j = (s >>> 0) % (i + 1); const tmp = tour[i]; tour[i] = tour[j]; tour[j] = tmp; } return tour; } /* ------------------------------------------------------------ */ async function main() { const jobs: Promise<{ bestLen: number; bestTour: number[] }>[] = []; for (let r = 0; r < RESTARTS; r++) { const runSeed = (seedBase + r * 0x6d2b79f5) | 0; jobs.push(call.solveTspGsa([worldSeed, runSeed, N, POP, ITERS])); } const results = await Promise.all(jobs); if (results.length === 0) throw new Error("no results (unexpected)"); // Find best + worst correctly let best = results[0]; let worst = results[0]; for (const res of results) { if (res.bestLen < best.bestLen) best = 
res; if (res.bestLen > worst.bestLen) worst = res; } // Build world once on host for verification const coords = makeCities(worldSeed, N); const dist = makeDistMatrix(coords, N); function checkResult( label: string, res: { bestLen: number; bestTour: number[] }, ) { validateTour(res.bestTour, N); const recomputed = recomputeLen(res.bestTour, dist, N); const delta = recomputed - res.bestLen; if (!Number.isFinite(res.bestLen) || res.bestLen < 0) { throw new Error(`${label}: bestLen invalid: ${res.bestLen}`); } if (Math.abs(delta) > 1e-6) { throw new Error( `${label}: length mismatch (delta=${delta}). Host generator != worker generator?`, ); } return { recomputed, delta }; } // Randomly choose ONE run result and verify it too (not just the best) const randIdx = pickRandomIndex(results.length, seedBase ^ 0xA5A5A5A5); const randomRes = results[randIdx]; const bestCheck = checkResult("best", best); const worstCheck = checkResult("worst", worst); const randomCheck = checkResult(`randomRun[#${randIdx}]`, randomRes); // Random baseline tour (not from solver) const randomTour = makeRandomTour(N, seedBase ^ 0xC0FFEE); validateTour(randomTour, N); const randomLen = recomputeLen(randomTour, dist, N); console.log("TSP via gravity (GSA) + 2-opt"); console.log("threads :", THREADS); console.log("restarts :", RESTARTS); console.log("cities :", N); console.log("pop :", POP); console.log("iters :", ITERS); console.log("worldSeed :", worldSeed); console.log("---"); console.log("bestLen :", best.bestLen); console.log("worstLen :", worst.bestLen); console.log(`randomRunLen :`, randomRes.bestLen, `(picked index ${randIdx})`); console.log("randomTourLen:", randomLen); console.log("---"); console.log("best delta :", bestCheck.delta); console.log("worst delta :", worstCheck.delta); console.log("random delta :", randomCheck.delta); console.log( "tour head :", best.bestTour.slice(0, Math.min(16, best.bestTour.length)).join(", "), "...", ); } if (isMain) { main().finally(shutdown); } ``` ```ts 
import { task } from "knitting"; type Args = readonly [ worldSeed: number, // generates the same city map for all runs runSeed: number, // controls the optimizer randomness nCities: number, popSize: number, iters: number, ]; type Result = { bestLen: number; bestTour: number[]; }; function xorshift32(s: number): number { s |= 0; s ^= s << 13; s ^= s >>> 17; s ^= s << 5; return s | 0; } const INV_U32 = 2.3283064365386963e-10; // 1 / 2^32 function rand01(stateRef: { s: number }): number { stateRef.s = xorshift32(stateRef.s); return (stateRef.s >>> 0) * INV_U32; } function makeCities(worldSeed: number, n: number): Float64Array { // coords: [x0,y0,x1,y1,...] in [0,1) const coords = new Float64Array(n * 2); const st = { s: worldSeed | 0 }; for (let i = 0; i < n; i++) { coords[i * 2 + 0] = rand01(st); coords[i * 2 + 1] = rand01(st); } return coords; } function makeDistMatrix(coords: Float64Array, n: number): Float32Array { const d = new Float32Array(n * n); for (let i = 0; i < n; i++) { const xi = coords[i * 2 + 0]; const yi = coords[i * 2 + 1]; for (let j = i + 1; j < n; j++) { const dx = xi - coords[j * 2 + 0]; const dy = yi - coords[j * 2 + 1]; const dist = Math.hypot(dx, dy); d[i * n + j] = dist; d[j * n + i] = dist; } } return d; } function tourLen(dist: Float32Array, n: number, tour: Int32Array): number { let sum = 0; let prev = tour[0]; for (let i = 1; i < n; i++) { const cur = tour[i]; sum += dist[prev * n + cur]; prev = cur; } sum += dist[prev * n + tour[0]]; return sum; } function decodeKeysToTour( keys: Float64Array, n: number, scratchIdx: number[], outTour: Int32Array, ) { // scratchIdx contains 0..n-1 and is reused scratchIdx.sort((a, b) => keys[a] - keys[b]); for (let i = 0; i < n; i++) outTour[i] = scratchIdx[i]; } const eps = 1e-12; function twoOpt(dist: Float32Array, n: number, tour: Int32Array): number { let best = tourLen(dist, n, tour); while (true) { let improved = false; outer: for (let i = 0; i < n - 1; i++) { for (let k = i + 2; k < n; k++) { const 
a = tour[i]; const b = tour[(i + 1) % n]; const c = tour[k]; const d = tour[(k + 1) % n]; const before = dist[a * n + b] + dist[c * n + d]; const after = dist[a * n + c] + dist[b * n + d]; if (after + eps < before) { // reverse segment (i+1..k) for (let l = i + 1, r = k; l < r; l++, r--) { const tmp = tour[l]; tour[l] = tour[r]; tour[r] = tmp; } // delta update is valid because we restart scanning immediately best += after - before; improved = true; break outer; } } } if (!improved) break; } // Safety: compute the true length once (guaranteed non-negative if dist is) return tourLen(dist, n, tour); } export const solveTspGsa = task({ f: ([worldSeed, runSeed, nCities, popSize, iters]) => { const n = nCities | 0; const pop = popSize | 0; const T = iters | 0; const coords = makeCities(worldSeed | 0, n); const dist = makeDistMatrix(coords, n); // Agent states const X = new Float64Array(pop * n); const V = new Float64Array(pop * n); const fit = new Float64Array(pop); const mass = new Float64Array(pop); const st = { s: runSeed | 0 }; // Init positions and velocities for (let i = 0; i < pop * n; i++) { X[i] = rand01(st); // [0,1) V[i] = (rand01(st) - 0.5) * 0.1; // small initial velocity } const scratchIdx: number[] = new Array(n); for (let i = 0; i < n; i++) scratchIdx[i] = i; const tmpTour = new Int32Array(n); const bestTour = new Int32Array(n); let bestLen = Infinity; // Helpers const idxPop: number[] = new Array(pop); for (let i = 0; i < pop; i++) idxPop[i] = i; const eps = 1e-9; const G0 = 100.0; const alpha = 20.0; // Main loop for (let t = 0; t < T; t++) { // Evaluate fitness (tour length) for (let i = 0; i < pop; i++) { const base = i * n; decodeKeysToTour(X.subarray(base, base + n), n, scratchIdx, tmpTour); const L = tourLen(dist, n, tmpTour); fit[i] = L; if (L < bestLen) { bestLen = L; bestTour.set(tmpTour); } } // Sort agents by fitness (ascending) idxPop.sort((a, b) => fit[a] - fit[b]); const bestF = fit[idxPop[0]]; const worstF = fit[idxPop[pop - 1]]; const 
denom = Math.max(eps, worstF - bestF); // Mass for minimization: better fitness => larger mass let sumM = 0; for (let r = 0; r < pop; r++) { const i = idxPop[r]; const m = (worstF - fit[i]) / denom; mass[i] = m; sumM += m; } const invSumM = 1 / Math.max(eps, sumM); for (let i = 0; i < pop; i++) mass[i] *= invSumM; // K-best shrinks over time const K = Math.max(2, (pop * (1 - t / T)) | 0); const G = G0 * Math.exp(-alpha * (t / T)); // Update each agent via gravitational attraction for (let ii = 0; ii < pop; ii++) { const i = idxPop[ii]; const Mi = Math.max(eps, mass[i]); const baseI = i * n; for (let d = 0; d < n; d++) { let Fi = 0; // Pull from top-K agents for (let kk = 0; kk < K; kk++) { const j = idxPop[kk]; if (j === i) continue; // Distance between agent vectors (cheap L2) const baseJ = j * n; let r2 = 0; for (let q = 0; q < n; q++) { const diff = X[baseJ + q] - X[baseI + q]; r2 += diff * diff; } const R = Math.sqrt(r2) + eps; const Mj = mass[j]; const rij = X[baseJ + d] - X[baseI + d]; // random factor to avoid lockstep collapse Fi += rand01(st) * G * (Mi * Mj) * (rij / R); } // a = F / Mi const a = Fi / Mi; // velocity + position update const idx = baseI + d; V[idx] = rand01(st) * V[idx] + a; X[idx] = X[idx] + V[idx]; // keep keys in a reasonable range if (X[idx] < -2) X[idx] = -2; else if (X[idx] > 3) X[idx] = 3; } } } // Local refinement: 2-opt on best tour const refined = bestTour.slice() as Int32Array; const refinedLen = twoOpt(dist, n, refined); if (refinedLen < bestLen) bestLen = refinedLen; // Return as plain JS array for safe payload compatibility const out: number[] = new Array(n); for (let i = 0; i < n; i++) out[i] = refined[i]; return { bestLen, bestTour: out }; }, }); ``` ## How the algorithm works **Representation:** TSP needs a permutation of cities. Each agent stores a real-valued vector -- sorting the keys produces the permutation. This lets "continuous" gravity-like motion work on a discrete problem. 
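
To make the random-key idea concrete, here is a small sketch. `decodeKeys` is a hypothetical helper written for this illustration; it mirrors what `decodeKeysToTour` does in the worker code above, but allocates a fresh array instead of reusing scratch buffers.

```ts
// Decode a vector of real-valued keys into a permutation of city ids:
// sorting the city indices by their key gives the visiting order.
function decodeKeys(keys: number[]): number[] {
  const order = keys.map((_, city) => city); // [0, 1, ..., n-1]
  order.sort((a, b) => keys[a] - keys[b]);
  return order;
}

const tour = decodeKeys([0.72, 0.10, 0.95, 0.33]);
console.log(tour); // [1, 3, 0, 2]

// Moving one key continuously (0.10 -> 0.40) swaps two neighbours in the tour.
const moved = decodeKeys([0.72, 0.40, 0.95, 0.33]);
console.log(moved); // [3, 1, 0, 2]
```

Any key vector decodes to a valid permutation, so the velocity updates on `X` never need a repair step to keep tours legal.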
**Global search (GSA):** Each agent has a mass proportional to its tour quality. Better tours = higher mass. Agents attract each other like gravity, gradually pulling the population toward better solutions. **Local refinement (2-opt):** After global search, pick two edges, reverse a segment if it shortens the tour, repeat until no improvement. Fast and usually provides large gains -- often the difference between "random" and "competitive." ## Correctness checks This example doesn't just trust the output. It validates: 1. **Tour validity** -- length is N, all integers, no duplicates, all cities present. 2. **Length recomputation** -- recomputes on the host using the same distance matrix, catching bugs like delta-update drift or corrupted permutations. 3. **Random baseline** -- compares against a random tour to confirm the solver is actually finding structure, not returning noise. ## CLI knobs - `--cities` -- increases difficulty sharply (N! possible tours) - `--restarts` -- more restarts = better chance of finding a good tour (cheap parallel wins) - `--iters` -- more iterations per run = deeper exploration per restart - `--pop` -- population size per run = more exploration (but more compute) - `--worldSeed` -- fix the city layout for reproducible comparisons Start small: `cities=32`, `restarts=threads*4`, `iters=200`. Increase quality via restarts first -- that's the cheapest way to improve results. ## Real-world analogs TSP shows up in delivery routing, manufacturing toolpath optimization, PCB drilling, robotics path planning, and scheduling problems. The parallel-restart pattern works for any metaheuristic where independent runs are cheap and you just need the best result. ## Things to try 1. Compare `--restarts` vs `--iters`: which improves quality faster per second of compute? 2. Increase `--cities` to 128 and watch how the search space explodes. 3. Lower `--pop` to 2-3 and see how much worse it gets (and how much faster). 4. 
Change the distance metric to Manhattan distance and compare behavior.

---

# Benchmarks

URL: https://knittingdocs.netlify.app/benchmarks/introduction/

Overall benchmark overview across Node, Deno, and Bun, plus a Tokio comparison for primitive and small-payload benchmark shapes.

This page is the high-level benchmark summary across Node, Deno, and Bun, with a Tokio close-up for the small-payload fast path. For runtime-specific detail, see the dedicated pages for Node, Deno, Bun, and Tokio.

These benchmarks quantify communication overhead and scaling behavior in specific benchmark shapes. They are not a direct prediction of end-to-end application speedup.

## What these benchmarks measure

- IPC overhead between main and worker threads
- End-to-end message latency as batch size increases
- Payload sensitivity (primitive, structured, and binary types)
- Throughput at growing payload sizes (up to `1 MiB`)
- Heavy-task scaling and parallel efficiency under CPU-intensive workloads

Interpretation rule: "`x` faster" here means faster for this benchmark setup (payloads, batching, CPU profile), not universally `x` faster for any app.

For primitive calls and small binary payloads, Knitting stays in the same microsecond tier as Tokio in these benchmark shapes. Read that as handoff cost: wakeups/signaling plus copying or cloning very small payloads, not a claim that every payload shape matches Tokio. That matters because it shows the shared-memory fast path is not just "good for JavaScript": it can stay competitive with a Rust baseline when coordination dominates.

## IPC (combined)

The IPC combined chart compares Knitting against worker `postMessage`, websocket, and HTTP across runtimes.

- At one message, Knitting is typically about `3.5x-6x` faster than worker `postMessage`.
- Against websocket, Knitting is usually around `3.5x-15x` faster.
- Against HTTP, Knitting is usually around `10x-57x` faster for the same benchmark shape.

## Latency (line)

This chart shows how latency changes as the number of messages per iteration increases.

- Across runtimes and batch sizes, Knitting is generally around `3x-45x` faster than worker baselines.
- The advantage is strongest in small-to-medium batch ranges where message overhead dominates.
- At very large batches, absolute latency rises for all approaches, but Knitting still keeps lower overhead.

## Payload Types (combined)

These two charts compare type-dependent overhead at count `1` and count `100`.

- Count `1`: Knitting remains low-latency even for structured and binary payloads.
- Count `100`: batching increases throughput while preserving strong relative performance.
- Heavier payload classes (e.g. large objects/arrays, errors) cost more in every runtime, but Knitting keeps the best profile overall.

## Call growth (1 MiB throughput)

From the `1048576 B` row (`avg`) in each runtime's `*_call-growth-batch` result (`batch=64`), one-way transfer throughput is:

| Runtime | String (GB/s) | Uint8Array (GB/s) |
| --- | ---: | ---: |
| Node | `1.44` | `7.50` |
| Deno | `3.37` | `5.78` |
| Bun | `11.86` | `16.21` |

Quick takeaways:

- Bun is clearly fastest at 1 MiB in this batched benchmark shape, for both text and binary payloads.
- Node's `Uint8Array` path improves sharply under batching and overtakes Deno on binary throughput, while Deno keeps the stronger string path.
- This complements the IPC/latency charts by showing behavior when payload size, not just call count, grows.
- At this size, ceilings are strongly influenced by memory bandwidth and runtime internals, not application logic.

## Heavy Load: Speedup and Efficiency

Heavy-load benchmarks run a CPU-intensive prime-number workload and distribute work across extra threads.

- Speedup grows steadily as threads are added, reaching roughly `3.5x-3.8x` at `+4` extra threads in these runs.
- Efficiency remains strong under contention, staying around `~70-77%` at higher thread counts.
- Bun and Deno show slightly stronger scaling than Node in this specific heavy-load scenario.
- This does not imply every app gets `3.5x+`; I/O-heavy services often see smaller changes.

## Where to read details

- Use this page for cross-runtime trends.
- Use `benchmarks/node`, `benchmarks/deno`, `benchmarks/bun`, and `benchmarks/tokio` for raw tables and per-runtime interpretation.

## Run the suite

```bash
./run.sh
```

Results are written into `results/`.

## JSON output

```bash
./run.sh --json
```

JSON output is useful for plotting scripts under `graphs/`.

---

# Node.js

URL: https://knittingdocs.netlify.app/benchmarks/node/

Node.js benchmark results for Knitting, including IPC latency, worker comparison, call-growth transfer throughput, heavy-load scaling, and payload type costs.

This page summarizes Node.js benchmark runs for Knitting on `node 24.12.0 (arm64-darwin)`.

## IPC (Node)

This benchmark compares one round-trip between a main thread and workers using different transports. Knitting has the lowest overhead in this setup:

- `1` message: Knitting is about `6x` faster than worker `postMessage`, `15x` faster than websocket, and `57x` faster than HTTP.
- `25` messages: Knitting is about `4x` faster than worker `postMessage`.
- `50` messages: Knitting is about `3.5x` faster than worker `postMessage`.

### Data

This stress test computes prime numbers over a large range, then serializes and parses large JSON payloads:

```ts
const N = 10_000_000; // search range: [1..N]
const CHUNK_SIZE = 250_000;
```

Even under this heavier workload, parallel workers scale well:

- `main + 1 extra thread`: `~1.7x` faster than main only.
- `main + 2 extra threads`: `~2.3x` faster than main only.
- `main + 3 extra threads`: `~3.0x` faster than main only.
- `main + 4 extra threads`: `~3.5x` faster than main only.

### Data

## All types in Knitting

This benchmark covers primitive, structured, collection, typed-array, error/date/symbol, promise-arg, and static-vs-dynamic allocator paths.
Results are reported for count `1` and count `100` to show both per-call latency and batched throughput.

Quick takeaways:

- In count `100`, primitive-style payloads are usually in the `~15-30 µs` range, while heavier structured/collection payloads can be `~120-300+ µs`.
- The static payload path is usually around `2x-4x` faster than dynamic allocator paths (for example: string `~3.4x`, json `~3.0x`, `Uint8Array` `~3.6x`, symbol `~4.1x` at count `100`).

Payload sizes (approximate):

| Payload | Size |
| --- | ---: |
| `jsonObj` | `206 B` |
| `jsonArr` | `217 B` |
| `mapPayload` | `284 B` |
| `Uint8Array` | `1024 B` |
| `Int32Array` | `1024 B` |
| `Float64Array` | `1024 B` |
| `BigInt64Array` | `1024 B` |
| `BigUint64Array` | `1024 B` |
| `DataView` | `1024 B` |
| `smallU8` | `480 B` |
| `largeU8` | `481 B` |

### Data

---

# Deno

URL: https://knittingdocs.netlify.app/benchmarks/deno/

Deno benchmark results for Knitting, including IPC latency, worker comparison, call-growth transfer throughput, heavy-load scaling, and payload type costs.

This page summarizes Deno benchmark runs for Knitting on `deno 2.6.6 (aarch64-apple-darwin)`.

## IPC (Deno)

This benchmark compares one round-trip between a main thread and workers using different transports. Knitting keeps the lowest overhead in this setup:

- `1` message: Knitting is about `3.5x` faster than worker `postMessage`, `3.6x` faster than websocket, and `10x` faster than HTTP.
- `25` messages: Knitting is about `9.5x` faster than worker `postMessage`.
- `50` messages: Knitting is about `10.7x` faster than worker `postMessage`.

### Data

This stress test computes prime numbers over a large range, then serializes and parses large JSON payloads:

```ts
const N = 10_000_000; // search range: [1..N]
const CHUNK_SIZE = 250_000;
```

Even under this heavier workload, parallel workers scale well:

- `main + 1 extra thread`: `~1.8x` faster than main only.
- `main + 2 extra threads`: `~2.5x` faster than main only.
- `main + 3 extra threads`: `~3.2x` faster than main only.
- `main + 4 extra threads`: `~3.7x` faster than main only.

### Data

## All types in Knitting

This benchmark covers primitive, structured, collection, typed-array, error/date/symbol, promise-arg, and static-vs-dynamic allocator paths. Results are reported for count `1` and count `100` to show both per-call latency and batched throughput.

Quick takeaways:

- In count `100`, primitive-style payloads are usually in the `~20-50 µs` range, while heavier structured/collection payloads can be `~160 µs` to multi-millisecond outliers.
- The static payload path is usually around `~1.8x-2.5x` faster than dynamic allocator paths (for example: string `~2.2x`, json `~2.5x`, `Uint8Array` `~1.8x`, symbol `~2.2x` at count `100`).

Payload sizes (approximate):

| Payload | Size |
| --- | ---: |
| `jsonObj` | `206 B` |
| `jsonArr` | `217 B` |
| `mapPayload` | `284 B` |
| `Uint8Array` | `1024 B` |
| `Int32Array` | `1024 B` |
| `Float64Array` | `1024 B` |
| `BigInt64Array` | `1024 B` |
| `BigUint64Array` | `1024 B` |
| `DataView` | `1024 B` |
| `smallU8` | `480 B` |
| `largeU8` | `481 B` |

### Data

---

# Bun

URL: https://knittingdocs.netlify.app/benchmarks/bun/

Bun benchmark results for Knitting, including IPC latency, worker comparison, call-growth transfer throughput, heavy-load scaling, and payload type costs.

This page summarizes Bun benchmark runs for Knitting on `bun 1.3.6 (arm64-darwin)`.

## IPC (Bun)

This benchmark compares one round-trip between a main thread and workers using different transports. Knitting keeps the lowest overhead in this setup:

- `1` message: Knitting is about `5.7x` faster than worker `postMessage`, `9.5x` faster than websocket, and `21x` faster than HTTP.
- `25` messages: Knitting is about `7.4x` faster than worker `postMessage`.
- `50` messages: Knitting is about `7x` faster than worker `postMessage`.
### Data

This stress test computes prime numbers over a large range, then serializes and parses large JSON payloads:

```ts
const N = 10_000_000; // search range: [1..N]
const CHUNK_SIZE = 250_000;
```

Even under this heavier workload, parallel workers scale well:

- `main + 1 extra thread`: `~1.8x` faster than main only.
- `main + 2 extra threads`: `~2.5x` faster than main only.
- `main + 3 extra threads`: `~3.3x` faster than main only.
- `main + 4 extra threads`: `~3.8x` faster than main only.

### Data

## All types in Knitting

This benchmark covers primitive, structured, collection, typed-array, error/date/symbol, promise-arg, and static-vs-dynamic allocator paths. Results are reported for count `1` and count `100` to show both per-call latency and batched throughput.

Quick takeaways:

- In count `100`, primitive-style payloads are usually in the `~16-25 µs` range, while heavier structured/collection payloads are often `~60-270 µs` with occasional larger spikes.
- The static payload path is usually around `~1.3x-3.1x` faster than dynamic allocator paths (for example: string `~1.8x`, json `~2.1x`, `Uint8Array` `~1.3x`, symbol `~3.1x` at count `100`).

Payload sizes (approximate):

| Payload | Size |
| --- | ---: |
| `jsonObj` | `206 B` |
| `jsonArr` | `217 B` |
| `mapPayload` | `284 B` |
| `Uint8Array` | `1024 B` |
| `Int32Array` | `1024 B` |
| `Float64Array` | `1024 B` |
| `BigInt64Array` | `1024 B` |
| `BigUint64Array` | `1024 B` |
| `DataView` | `1024 B` |
| `smallU8` | `480 B` |
| `largeU8` | `481 B` |

### Data

---

# Tokio

URL: https://knittingdocs.netlify.app/benchmarks/tokio/

Tokio benchmark comparison for Knitting, covering scalar latency, 1 MiB payload behavior, byte-size sweeps, an `Arc<Vec<u8>>` reference sweep, and the benchmark fairness model.

This page compares Tokio against Knitting on Bun, Node.js, and Deno using the same batch-oriented echo benchmark.

Benchmark source: [`mimiMonads/knitting-vs-tokio-bench`](https://github.com/mimiMonads/knitting-vs-tokio-bench).
## What this benchmark measures

Whole-batch latency for three payload shapes:

- `f64`
- `String` / large UTF-8 text
- `Uint8Array` / raw bytes
- separate `Arc<Vec<u8>>` reference sweep

## Fairness and the one intentional asymmetry

Three major sources of skew are already handled:

- **Dispatch shape is aligned.** Rust fans out via spawned tasks and waits with `join_all(...)`, matching knitting creating all `pool.call.*(...)` promises and awaiting `Promise.all(...)`.
- **Runtime width is aligned.** Knitting uses `threads: 1`, and Rust uses `#[tokio::main(worker_threads = 1)]`, so sender fan-out can't spread across a bigger worker pool.
- **Round-trip work is aligned.** The default `String` and `Uint8Array` paths pay payload work in both directions on both sides; Tokio explicitly clones on send and clones again on the worker reply so the return path is not a cheaper move-only shortcut.

One asymmetry is kept on purpose: **memory management**.

### Allocation model

This benchmark measures "total cost of the system as designed", not "transport cost after normalizing allocation away". Large payloads have to be copied or shared somehow, and that choice is part of the cost.

For large string and byte payloads:

- Rust `String` / `Vec<u8>` pays `clone()` (heap allocation + memcpy) in the timed section.
- Knitting copies into a preallocated shared-memory region managed by its own allocator-like bookkeeping.

Avoiding general-purpose allocation in the hot path is part of what makes knitting interesting, so the benchmark keeps that cost in-bounds rather than hiding it.

The `Arc<Vec<u8>>` sweep is included separately for exactly that reason: it shows the shared-ownership upper bound for tiny payloads without pretending that it is the default fair byte path.
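The difference between the two allocation strategies can be sketched in a few lines of TypeScript. This is an illustration only, not code from the benchmark repository: `cloneEcho`, `sharedEcho`, `region`, and the 1 MiB region size are hypothetical names and values.

```ts
// Hypothetical illustration of the two strategies; not the benchmark's code.
const REGION_BYTES = 1024 * 1024;

// Allocated once, before any timed work starts.
const region = new Uint8Array(new SharedArrayBuffer(REGION_BYTES));

// Clone-style echo: every call allocates a fresh buffer, then copies into it.
function cloneEcho(payload: Uint8Array): Uint8Array {
  return payload.slice(); // allocation + copy
}

// Shared-region echo: every call copies into memory that already exists.
function sharedEcho(payload: Uint8Array): Uint8Array {
  if (payload.byteLength > region.byteLength) {
    throw new RangeError("payload does not fit in the preallocated region");
  }
  region.set(payload, 0); // copy only
  // A view over the shared bytes; no new backing storage is allocated.
  return region.subarray(0, payload.byteLength);
}
```

Both functions still copy the payload once. What the second one avoids is asking a general-purpose allocator for new backing storage on every call, which is the cost the benchmark deliberately keeps in bounds.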
## A rough cost model

For the payload-heavy echo cases, treat the benchmark as measuring two different "systems":

- **knitting:** shared-buffer copies + allocator-style region management (JS values still get materialized when a worker reads/returns them)
- **tokio default:** clone-driven allocation + payload copies on the channel path
- **tokio Arc reference:** `Arc::clone` shared ownership for the byte buffer handle

The exact low-level behavior depends on payload type and runtime, but the high-level point is stable: knitting is buying speed by replacing repeated general-purpose allocation with preallocated shared-memory management.

## Why knitting can be fast

A few concrete things knitting does that matter for this benchmark:

- **Fixed pool topology → simpler queues.** The pool knows its workers up front, and each host↔worker lane is effectively single‑producer/single‑consumer. That's cheaper than a fully general multi‑producer channel.
- **Low-garbage hot path.** Most transport work happens inside typed-array-backed buffers and reused task objects, reducing allocation churn and GC pressure (and references get cleared quickly after each call settles).
- **Two-tier payload path.** Small payloads encode inline in the per-call header slot (roughly ~0.5 KiB per in-flight call, with ~480 bytes usable for inline data, matching the `480 B` / `481 B` boundary payloads in the per-runtime tables); larger payloads spill into the shared payload buffer (SAB/GSAB).
- **Shared payload buffer + mini allocator.** Large payloads are copied into a preallocated `SharedArrayBuffer` and carved into 64‑byte‑aligned regions tracked by a small slot table/bitset (more complexity, less `malloc` in the hot path).
- **Primitives are "header-only".** Numbers/booleans/null/etc encode directly in header words (no payload buffer at all), keeping contention and copying low.
- **Optional "gc at idle boundaries".** When workers have `gc()` available, knitting may trigger a GC before going into longer spin/park waits, nudging collections away from the hot loop.
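The "mini allocator" idea in the list above can be sketched as a first-fit allocator over 64-byte blocks. This is a simplified illustration under stated assumptions, not Knitting's implementation: the `RegionAllocator` class, its method names, and the one-byte-per-block table (a real bitset would pack flags into words) are all hypothetical.

```ts
// Hypothetical sketch of a 64-byte-aligned region allocator; not Knitting's internals.
const BLOCK = 64; // block size and alignment, in bytes

class RegionAllocator {
  // One flag per 64-byte block: 0 = free, 1 = in use.
  private readonly used: Uint8Array;

  constructor(totalBytes: number) {
    this.used = new Uint8Array(Math.floor(totalBytes / BLOCK));
  }

  // Returns the byte offset of a 64-byte-aligned region, or -1 if nothing fits.
  alloc(byteLength: number): number {
    const need = Math.max(1, Math.ceil(byteLength / BLOCK));
    let run = 0; // length of the current run of free blocks
    for (let i = 0; i < this.used.length; i++) {
      run = this.used[i] === 0 ? run + 1 : 0;
      if (run === need) {
        const start = i - need + 1;
        this.used.fill(1, start, i + 1);
        return start * BLOCK;
      }
    }
    return -1;
  }

  // Marks the blocks covering [offset, offset + byteLength) as free again.
  free(offset: number, byteLength: number): void {
    const start = offset / BLOCK;
    const need = Math.max(1, Math.ceil(byteLength / BLOCK));
    this.used.fill(0, start, start + need);
  }
}
```

Because every region starts on a block boundary, offsets are always multiples of 64, and the bookkeeping is a small table scan instead of a general-purpose heap call.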
None of this is free: it trades simplicity for careful memory layout, extra bookkeeping, and more "allocator-like" engineering. That trade is exactly what this repo is trying to make visible. --- # Why Knitting URL: https://knittingdocs.netlify.app/extras/why/ Why Knitting exists: move hot JavaScript functions off the main thread without turning them into services. Most apps do not need a new service the first time one function gets expensive. They need a better place for that function to run. A parser gets large. A validator gets complicated. A hash, render, or compression step starts taking more of the event loop than it should. The rest of the app is still fine. The problem is smaller than the architecture we often reach for. Knitting is built for that middle place: keep the code local, but move the work off the main thread. ## The hot-function problem CPU pressure in a JavaScript app rarely spreads evenly. Real codebases often have thousands of functions, but only a few of them create most of the cost. That matters because it changes the size of the fix. You may not have an application problem. You may have a function problem. If the heat is local, the solution should be local too. ## The usual choices When the main thread starts paying for those functions, JavaScript developers usually reach for one of three options. **Do nothing.** It is simple, and for a while it works. But every millisecond a hot function spends on the main thread is a millisecond the server cannot spend on everything else. Cheap routes queue behind expensive ones. Scaling out dilutes the pain, but it does not remove the hot path from the event loop. **Workers.** The shape is closer: move the work somewhere else on the same machine. The rough part is everything around it. The runtime gives you `postMessage`; the rest is yours: request IDs, routing, errors, promises, lifecycle, payload choices. That plumbing is the original reason Knitting exists. 
**Make it a service.** Sometimes that is the right call. Separate ownership, deploy cadence, and failure domains are all real reasons to cross the network. But for one hot function, owned by the same team, in the same repo, the bill is strange: a network hop, serialization, deployment, monitoring, and one more thing to operate.

The function did not ask for a hostname. It asked for somewhere else to run.

## The fourth shape

Knitting adds a smaller boundary:

> Keep the code local. Move the execution boundary.

The function stays in your repo, in your language, in your types, one `import` away from its caller. What changes is *where it runs* and *what it can touch*. You export it, hand it to a pool, and call it like the async function it already was. Underneath, it runs on a real thread or an isolated process.

```ts
import { createPool, isMain } from "knitting";

export const hello = (name: string) => "Hello " + name;

using pool = createPool({})({ hello });

if (isMain) console.log(await pool.call.hello("World!"));
```

That is the whole boundary: one export, one pool, one call. `hello` still reads like a plain function, but it no longer runs on the main thread.

So Knitting is a **function-level execution boundary**. Not a new service, and not just a worker helper, but a way to move the expensive part without moving the whole app.

Compare what each option costs for the same few hot functions:

| | In-process | Hand-rolled workers | Microservice | Knitting |
|---|---|---|---|---|
| Main thread protected | no | yes | yes | yes |
| Code stays in the app | yes | mostly | no | yes |
| Call-site ergonomics | function call | protocol you wrote | HTTP client | function call |
| Transport cost | none | `postMessage` per call | network + JSON | shared-memory mailboxes |
| Isolation available | none | thread only | full, always-on | a dial, per pool |
| New deploy unit | no | no | yes | no |

Two details make the fourth shape useful.
**Transport cost matters.** Workers can feel disappointing when reaching the worker costs more than the work. Knitting uses shared-memory mailboxes instead of the runtime's message queue, so small and medium calls stay practical. The [Architecture](/extras/architecture/) page explains the mechanism, and the [benchmarks](/benchmarks/introduction/) show where the shape wins and where it does not.

**Isolation should match the task.** A microservice gives you process isolation whether or not you need it. Knitting lets each pool pick its own boundary: in-process guards, a bootstrap hook, runtime-native permissions, or a real OS-sandboxed process. Trusted math can stay on cheap threads. An untrusted plugin can run behind `bwrap`, and [`importTask`](/guides/defining-tasks/#module-loading) keeps its code off the host entirely.

Knitting does not replace distributed systems. If you need cross-machine scale, separate failure domains, or team-level ownership boundaries, you still need the network. Knitting is for the moment before that, when the code belongs in the app but the work no longer belongs on the main thread.

> Not every hot path deserves a hostname.

## Where it's going

> Note: Everything below is roadmap as of June 2026. None of it is shipped API, and the shape may change or be cut as the design matures. Build against what the [guides](/guides/defining-tasks/) document today.

Knitting today is task-call oriented: one request in, one response out. The transport underneath it is more general than that. Mailboxes, payload buffers, and named mappings leave room for other shapes.

### Nearer term: Channels

Some same-host work is not a task call: progress, long-lived coordination, producer/consumer flows. Today the honest answer is MessagePort. A future channel API would keep that shape on the same shared-memory transport, without pretending every conversation is request/response.

### Longer term: Cross-language, same-host

The mailbox protocol is not tied to JavaScript. Process workers already open named mappings, which makes the boundary more about memory than a specific runtime. A later version could let JavaScript hand large payloads to another runtime on the same machine without copying through JSON or re-marshalling at every hop. That is why the transport is built with more care than a worker pool strictly needs. Until those APIs ship, though, judge Knitting on what it does now. ## Read next - [Quick Start](/start/quick-start/) -- the ten-line version of everything argued above. - [Architecture](/extras/architecture/) -- the mailbox transport, lane model, and safety layers in detail. - [Process workers](/guides/process-workers/) -- the hard end of the isolation dial: sandboxes and containers. - [Examples](/examples/intro_examples/) -- the hot functions behind this argument, as copyable code. --- # Architecture URL: https://knittingdocs.netlify.app/extras/architecture/ How Knitting's shared-memory transport, mailbox protocol, and lane model work under the hood. Ever wondered what actually happens between `call.myTask(args)` and the result landing back in your hands? This page walks that path. You don't need any of it to *use* Knitting -- the [Quick start](/start/quick-start/) and [Creating pools](/guides/creating-pools/) guides have you covered -- but it's here for when you're curious *why* things are fast, chasing a strange bug, or reaching for the advanced knobs. One caveat up front: this is a **simplified mental model**. The real implementation has many more moving parts -- fast paths, edge cases, platform quirks -- that are deliberately left out so the big picture stays readable. Treat the numbers and steps below as the shape of things, not the full contract. ## High-level picture Knitting has three layers: 1. **API layer** -- `task()`, `createPool()`, `call.*()`, `shutdown()`. This is what your code touches. 2. 
**Dispatch layer** -- the host handler: lane routing, the optional balancer strategy, and the inliner. This decides *where* a call runs. 3. **Transport layer** -- shared-memory mailboxes, payload buffers, wakeups. This moves data between threads without going through the runtime's message queue. Knitting runs **thread** workers by default. It can also run each worker as a **separate process** (for sandboxing or containers): the transport is the same shared-memory idea; a process worker just reaches the memory through a *named* mapping instead of an inherited handle. See [Process workers](/guides/process-workers/) and [Shared memory](/guides/shared-memory/). ## Transport: shared-memory mailboxes This is the core of Knitting's speed advantage. Instead of using `postMessage` (which serializes data, queues it, and deserializes on the other side), Knitting writes directly to `SharedArrayBuffer` regions visible to both threads. ### The mailbox model Each worker has two independent mailboxes: - **Request mailbox** (host -> worker): the host writes a call header here, the worker reads it. - **Response mailbox** (worker -> host): the worker writes the result header here, the host reads it. Each mailbox has **32 slots** -- the slot index is a 5-bit field, so 2⁵ = 32 slots per direction. A slot is a small fixed-size region that holds: - Task function ID (`Uint16`, supports up to 65,536 tasks per pool) - Payload type tag - Small inline values (numbers, booleans, short strings fit directly in the header) ### Slot ownership via a two-word lock Slot state lives in two 32-bit atomic words: `hostBits` and `workerBits`. They sit on **separate 64-byte cache lines**, so the host and worker never fight over the same line -- the *false sharing* that would otherwise bounce a cache line between cores and collapse throughput. 
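A minimal sketch of such a two-word layout, assuming one `Int32Array` over a `SharedArrayBuffer`. The index constants and the helper names (`freeMask`, `hostPublish`, `workerRelease`) are hypothetical and chosen for illustration; they are not Knitting's real offsets or functions. The sketch treats a slot as free when both words agree on its bit.

```ts
// Hypothetical layout; offsets and names are illustrative, not Knitting's internals.
const CACHE_LINE = 64;              // bytes
const HOST_BITS = 0;                // Int32 index 0  -> byte offset 0
const WORKER_BITS = CACHE_LINE / 4; // Int32 index 16 -> byte offset 64

// Two cache lines: one word of slot state at the start of each.
const words = new Int32Array(new SharedArrayBuffer(2 * CACHE_LINE));

// Bit i is set in the result when slot i is free (both words agree on bit i).
// With 32 slots, every bit of the 32-bit word is a valid slot.
function freeMask(): number {
  const h = Atomics.load(words, HOST_BITS);
  const w = Atomics.load(words, WORKER_BITS);
  return ~(h ^ w) >>> 0;
}

// Host side: publish slot i by toggling its bit in the host word.
function hostPublish(i: number): void {
  Atomics.xor(words, HOST_BITS, 1 << i);
}

// Worker side: hand slot i back by toggling the same bit in the worker word.
function workerRelease(i: number): void {
  Atomics.xor(words, WORKER_BITS, 1 << i);
}
```

Each side only ever writes its own word, so the two writers never store to the same cache line; they only read each other's.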
A slot is **busy** when the two words disagree on its bit and **free** when they agree: $$ \text{free} = \sim(H \oplus W)\ \&\ \text{MASK} $$ Publishing work is therefore a single atomic toggle of one bit -- not a message copy. Because each direction has exactly one writer (the host writes requests, the worker writes responses), this is a single-producer/single-consumer queue per direction: no mutex, no critical section, just write the payload, then publish the bit. The reader acquires the bit before it trusts the bytes. The lifecycle of a slot: 1. Host atomically claims a free slot in the request mailbox. 2. Host writes the call header (task ID, payload tag, inline data or payload offset). 3. Host notifies the worker. 4. Worker reads the slot, executes the task. 5. Worker claims a slot in the response mailbox, writes the result. 6. Worker notifies the host. 7. Host reads the result, releases the response slot, and resolves the promise. ### Wakeups When the host writes a request, it needs to wake a potentially parked worker. When a worker writes a response, it needs to wake the host handler. Knitting uses `Atomics.notify` (futex-style wakeups) for this. Workers don't busy-wait by default. The idle cycle is a bounded spin, then a park: 1. **Spin** -- check the mailbox in a tight loop for `spinMicroseconds` (default: scales with lane count). Uses `Atomics.pause` to reduce power draw. 2. **Park** -- call `Atomics.wait` with a timeout of `parkMs`. The thread sleeps until notified or the timeout expires. 3. **Wake** -- `Atomics.notify` from the host breaks the park immediately. This means idle workers consume near-zero CPU while still waking up within microseconds when work arrives. ## Data path: payload buffers Values that don't fit in the mailbox header slot need a separate path. 
Each worker allocates two `SharedArrayBuffer` regions:

- **Request payload buffer** -- for arguments sent host -> worker
- **Return payload buffer** -- for results sent worker -> host

Sizes are controlled by `payloadInitialBytes` (default 4 MiB) and `payloadMaxByteLength` (default 64 MiB). When growable `SharedArrayBuffer` is available in the runtime, buffers start small and grow on demand. Otherwise, they're allocated at max size upfront.

### Three encoding paths (fast -> slow)

| Path | When it's used | Cost |
| --- | --- | --- |
| **Header-only** | Primitives (`boolean`, `null`, `undefined`, `number`, small `string`, small `bigint`, `Date`) | Near zero. Value fits in the slot header. |
| **Static payload** | Small typed arrays, short strings that overflow the header | Low. Copies into a fixed region of the payload buffer. |
| **Dynamic payload** | Objects, arrays, large strings, `Error` | Higher. Requires allocation, encoding (similar to JSON serialization), and copying. |

Shared-memory types (`SharedArrayBuffer`, `ProcessSharedBuffer`) and the ownership-move `BufferReference` skip all three: the bytes aren't copied, only a small descriptor crosses. The [Performance guide](/guides/performance) has exact tier ratings per type.

## Dispatch: lanes and routing

A **lane** is an execution target -- either a worker thread or the inline lane. The total lane count is `threads + (inliner ? 1 : 0)`.

When you call `call.myTask(args)`, the host **handler**:

1. Resolves any promise arguments on the host.
2. Picks a lane (using the balancer strategy when there's more than one).
3. Encodes the arguments and writes them to that lane's mailbox (or queues them for the inliner).
4. Returns a promise that resolves when the response arrives.

The handler runs on *every* call. *Which* lane it picks is the job of the **balancer** strategy -- and that only matters when there's more than one lane to choose between.
With a single worker and no inliner there is nothing to balance, so the balancer is bypassed entirely.

### Balancer strategies

These are the values of the `balancer` option on `createPool`:

| Strategy | Behavior |
| --- | --- |
| `roundRobin` (default) | Rotates through lanes in order. Simple, fair, predictable. |
| `firstIdle` | Picks the first lane with no in-flight work, falls back to round-robin. |
| `randomLane` | Picks a random lane. Good for uneven task durations. |
| `firstIdleOrRandom` | First idle lane, else random. Balances fairness with load distribution. |

### Handler backoff

The host handler has its own stall-avoidance logic:

- `stallFreeLoops` (default 128): how many immediate notify-check loops run before backoff starts.
- `maxBackoffMs` (default 10): ceiling for exponential backoff delay.

Under sustained high load, the handler stays in tight loops. Under intermittent load, it backs off to avoid burning CPU while idle.

## Inliner: the host as a lane

The optional **inliner** adds the host thread itself as an execution lane. Inline tasks skip the entire transport layer -- no encode, no mailbox write, no decode. The task function runs directly on the main thread.

Inline execution is deferred to a macro-task boundary (via `MessageChannel`) so the handler can service worker sends/receives first, then the host drains inline work.

Key details:

- `position: "first" | "last"` -- where the inline lane sits in the balancer's lane order.
- `batchSize` -- how many inline tasks run per event-loop tick.
- `dispatchThreshold` -- minimum in-flight calls before the inline lane is eligible.
- Abort signals on inline tasks use a static toolkit where `hasAborted()` always returns `false` (inline tasks can't be individually cancelled since they share the host thread).

See [Inliner guide](/guides/inliner) for when to use it and when to avoid it.

## Task lifecycle (end to end)

Once the handler has picked a lane, that lane's **tx-queue** drives the round trip.
Here's a single `call.add([1, 2])`: 1. **User code (host):** `pool.call.add([1, 2])` returns a `Promise` immediately -- the input isn't a promise, so nothing is awaited first. 2. **Host tx-queue:** encodes the call header + payload into a free request slot. (`[1, 2]` is a small tuple, so it takes the static-payload path at a known offset.) 3. **Host tx-queue:** toggles `hostBits` to publish the slot. 4. **Worker loop:** had been spinning, then parked on the signal word. The signal word changed, so it wakes. 5. **Worker loop:** decodes the args and runs the task fn -- `([a, b]) => a + b` returns `3`. 6. **Worker loop:** writes the result into a response slot and toggles `workerBits` to publish. 7. **Host:** drains the result from shared memory and toggles its bit to release the slot. 8. **Host:** resolves (or rejects) the `Promise` returned in step 1. If the task throws, step 6 writes an error result instead, and step 8 rejects the promise. ## Pool lifecycle ### Startup (`createPool`) 1. Allocates shared memory regions (mailboxes + payload buffers) for each worker. 2. Spawns `threads` workers -- threads by default, or separate processes when configured. Each worker imports the task module to discover exported `task()` values. 3. Workers enter their idle spin/park loop, waiting for work. 4. If `permission` is set, generates runtime-specific CLI flags and passes them via `workerExecArgv`. 5. Returns the pool -- a typed `{ call, shutdown }` object that is also disposable, so a `using` declaration can close it for you. ### Running - Calls flow through the dispatch layer continuously. - Workers spin briefly after completing work (in case more arrives), then park. - The host handler manages backoff independently. ### Shutdown A `using` pool shuts down on its own when the scope ends. Calling `shutdown()` yourself runs the same teardown -- just earlier, the moment you ask for it. Either way: 1. `shutdown()` signals all workers to stop. 2. 
If `resolveAfterFinishingAll` is `true`, workers finish all pending promises before exiting.
3. All in-flight `call.*()` promises for abort-aware tasks reject with `"Thread closed"`.
4. Worker threads terminate.

## Abort signal pool

Tasks defined with `abortSignal: true` or `abortSignal: { hasAborted: true }` use a shared-memory bitset to track cancellation state. The pool has a fixed capacity (default 258, tunable via `abortSignalCapacity`).

When the host calls `.reject()` on an abort-aware promise, it flips a bit in the shared bitset. The worker can poll `toolkit.hasAborted()` to check that bit and bail out early. This is cooperative, not preemptive -- the worker must check. If it doesn't, the host promise still rejects immediately, but the worker task runs to completion in the background.

## Memory layout (per worker)

| Region | Default size | Purpose |
| --- | --- | --- |
| Request mailbox | Fixed (32 slots) | Call headers, host -> worker |
| Response mailbox | Fixed (32 slots) | Result headers, worker -> host |
| Request payload buffer | 4 MiB initial, 64 MiB max | Argument data |
| Return payload buffer | 4 MiB initial, 64 MiB max | Result data |
| Abort signal bitset | Scales with `abortSignalCapacity` | Cancellation flags |

With default settings and 4 workers, the shared memory footprint is roughly:

`4 workers x (2 mailboxes + 2 x 4 MiB payload buffers) ~ 32 MiB initial`

Payload buffers grow on demand up to `payloadMaxByteLength` if the runtime supports growable `SharedArrayBuffer`.

## Safety layers

Workers run code the host may or may not trust, so isolation is a dial, not a switch. Knitting stacks four independent layers -- cheapest and softest first, costliest and hardest last. They compose: trusted local tasks can stop at Layer 1, while an untrusted plugin can be pushed all the way to a sandboxed process. This is defence in depth; no single layer is assumed sufficient on its own.
- **Layer 1 -- in-process guards (always on).** Before any task module loads, the worker neutralizes the most dangerous calls: `process.exit`, `process.kill`, `process.abort`, and `Deno.exit` are redefined to throw, and the raw shared-memory handles are scrubbed from the data object visible to task code. Cheap, but a guardrail against accidental misuse -- not a wall against a hostile co-resident. - **Layer 2 -- bootstrap hook.** `worker.bootstrap` runs a privileged module once per worker, *before* task imports -- the right place to strip env vars, install your own guards, or freeze globals. - **Layer 3 -- runtime permissions (strict by default).** The policy is translated into each runtime's native enforcement (Deno's permission flags, Node's permission model, Bun's equivalents), so the boundary is the runtime's, not a library check task code could bypass. - **Layer 4 -- process + real sandbox.** The only OS-enforced boundary: `worker.runtime: "process"`, optionally launched through `bwrap`, Docker, or `systemd-run` via `processCommandPrefix`. Pair it with `importTask` so the isolated code never loads in the host at all. See [Permissions](/guides/permissions/) and [Process workers](/guides/process-workers/) for the full configuration. ## What Knitting does NOT do Knowing where Knitting stops is as useful as knowing what it does: - **No message passing protocol.** Knitting is task-call oriented. If you need pub/sub or event-style messaging, use `postMessage` / `MessagePort`. - **No preemption.** A long-running task blocks its lane until it finishes. Use `abortSignal` with `hasAborted()` polling for cooperative cancellation. - **Same host only.** Workers can be threads *or* separate processes on one machine, but the shared memory never crosses the network -- there is no cross-machine transport. Pair Knitting with one if you need to scale out. - **No automatic scaling.** Worker count is fixed at pool creation. You choose the parallelism level upfront. 
- **Browser support is limited.** There is a [browser build](/browser/), but shared memory there needs cross-origin isolation (`COOP` / `COEP` headers) because of Spectre-era constraints. Most Knitting work still happens server-side.
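
The cooperative cancellation mentioned under "No preemption" can be illustrated without Knitting's API. The sketch below is a generic shared-bitset pattern, not the library's toolkit: `flags`, `requestAbort`, `hasAborted`, `countPrimes`, and the capacity of 256 are hypothetical names and values chosen for the example, and both sides run on one thread here for simplicity.

```ts
// Hypothetical sketch of cooperative cancellation over a shared bitset.
// This is not Knitting's toolkit API; names and sizes are illustrative.
const CAPACITY = 256; // number of cancellation flags
const flags = new Int32Array(new SharedArrayBuffer(CAPACITY / 8));

// Host side: set the flag for one in-flight call.
function requestAbort(id: number): void {
  Atomics.or(flags, id >>> 5, 1 << (id & 31));
}

// Worker side: the check a task makes between units of work.
function hasAborted(id: number): boolean {
  return (Atomics.load(flags, id >>> 5) & (1 << (id & 31))) !== 0;
}

// A long-running task that cooperates by polling once every 1024 candidates.
function countPrimes(limit: number, id: number): number | "aborted" {
  let count = 0;
  for (let n = 2; n <= limit; n++) {
    if ((n & 1023) === 0 && hasAborted(id)) return "aborted";
    let prime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) {
        prime = false;
        break;
      }
    }
    if (prime) count++;
  }
  return count;
}
```

Nothing interrupts the loop from outside. The task stops only because it checks the flag, so a task that never polls keeps its lane busy until it returns on its own.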