Wasm Query Operators
Status: Prototype
Last updated: 2026-05-09
Goal: Define a stable ABI for columnar batches and a portable plan artifact so SkeinDB can compile query operators to WebAssembly.
This document defines the v1 ABI and the initial plan subset used by wasm.plan.*.
1) Scope (v1)
Supported operator subset:
- scan (single base table)
- filter (WHERE predicate)
- project (SELECT expressions)
Unsupported in v1:
- joins, aggregates, group_by/having
- order_by, limit, distinct
- subqueries, case/cast expressions
2) Columnar batch ABI (skein.wasm.batch.v1)
2.1 Layout overview
Batches are encoded as a single byte buffer in little-endian order. All offsets are relative to the start of the buffer.
struct BatchHeaderV1 {
u32 magic; // 'S','K','B','1'
u16 version; // 1
u16 flags; // reserved
u32 row_count;
u32 column_count;
u32 columns_offset; // start of ColumnMeta array
}
struct ColumnMetaV1 {
u32 type_tag; // see 2.2
u32 data_offset; // start of column data
u32 data_len; // bytes
u32 nulls_offset; // 0 if no null bitmap
u32 nulls_len; // bytes
u32 aux_offset; // for varlen (offsets) or 0
u32 aux_len;
}
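As a rough illustration of the layout above, the following Python sketch reads the header and the ColumnMeta array with the standard `struct` module. It assumes both structs are packed with no padding (20 and 28 bytes) and that the magic bytes appear in the buffer in the order 'S','K','B','1'; the helper name `parse_batch` is hypothetical and not part of SkeinDB.

```python
import struct

# Field order follows BatchHeaderV1 / ColumnMetaV1; "<" means little-endian.
# ASSUMPTION: the structs are tightly packed, with no alignment padding.
HEADER = struct.Struct("<IHHIII")   # magic, version, flags, row_count, column_count, columns_offset
COLUMN_META = struct.Struct("<7I")  # type_tag, data_offset, data_len, nulls_offset, nulls_len, aux_offset, aux_len
# ASSUMPTION: 'S' is the first byte of the buffer.
MAGIC = int.from_bytes(b"SKB1", "little")

META_FIELDS = ("type_tag", "data_offset", "data_len",
               "nulls_offset", "nulls_len", "aux_offset", "aux_len")


def parse_batch(buf: bytes):
    """Return (row_count, list of column-meta dicts) for a v1 batch buffer."""
    magic, version, _flags, rows, cols, cols_off = HEADER.unpack_from(buf, 0)
    if magic != MAGIC or version != 1:
        raise ValueError("not a skein.wasm.batch.v1 buffer")
    metas = []
    for i in range(cols):
        fields = COLUMN_META.unpack_from(buf, cols_off + i * COLUMN_META.size)
        metas.append(dict(zip(META_FIELDS, fields)))
    return rows, metas


# Build a one-column, two-row u64 batch by hand and parse it back.
values = struct.pack("<2Q", 7, 42)
data_off = HEADER.size + COLUMN_META.size
buf = (HEADER.pack(MAGIC, 1, 0, 2, 1, HEADER.size)
       + COLUMN_META.pack(3, data_off, len(values), 0, 0, 0, 0)
       + values)
rows, metas = parse_batch(buf)
```

Because every offset is relative to the start of the buffer, the same parser would apply whether the bytes come from module memory or from a decoded transport payload.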
2.2 Type tags
Type tags align with SkeinQL literal kinds:
- 1: bool (1 byte per row)
- 2: i64 (8 bytes)
- 3: u64 (8 bytes)
- 4: f64 (8 bytes)
- 5: str (varlen, UTF-8)
- 6: bytes (varlen)
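One way to carry these tags in host code is a small enum plus a width table. This is a sketch only: the names `TypeTag`, `FIXED_WIDTH`, and `expected_data_len` are hypothetical, and it assumes that a fixed-width column stores one slot per row (including NULL rows), which the text above does not state explicitly.

```python
from enum import IntEnum
from typing import Optional


class TypeTag(IntEnum):
    BOOL = 1
    I64 = 2
    U64 = 3
    F64 = 4
    STR = 5
    BYTES = 6


# Bytes per row for the fixed-width kinds; varlen kinds are absent on purpose.
FIXED_WIDTH = {TypeTag.BOOL: 1, TypeTag.I64: 8, TypeTag.U64: 8, TypeTag.F64: 8}


def expected_data_len(type_tag: int, row_count: int) -> Optional[int]:
    """Expected data_len for a fixed-width column, or None for varlen columns.

    ASSUMPTION: fixed-width columns store exactly one slot per row.
    Unknown tags raise ValueError via the enum constructor.
    """
    width = FIXED_WIDTH.get(TypeTag(type_tag))
    return None if width is None else width * row_count
```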
2.3 Null bitmap
If nulls_offset is non-zero, it points to a bitmap with 1 bit per row.
A set bit (1) marks a non-null value; a clear bit (0) marks NULL.
If nulls_offset is 0, the column has no bitmap and all values are non-null.
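A validity-bitmap lookup could look like the sketch below. The bit order within each byte is not specified in this section, so the code assumes least-significant-bit first (row 0 is bit 0 of byte 0); it also assumes the bitmap occupies ceil(row_count / 8) bytes. Both helper names are hypothetical.

```python
def bitmap_len(row_count: int) -> int:
    """Bytes needed for one bit per row (ASSUMPTION: no extra padding)."""
    return (row_count + 7) // 8


def is_null(bitmap: bytes, row: int) -> bool:
    """True when the row's validity bit is 0.

    ASSUMPTION: LSB-first bit order inside each byte.
    """
    bit = (bitmap[row // 8] >> (row % 8)) & 1
    return bit == 0


# Rows 0 and 2 are present, row 1 is NULL.
bitmap = bytes([0b00000101])
flags = [is_null(bitmap, r) for r in range(3)]
```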
2.4 Varlen encoding
For str and bytes columns:
- aux_offset points to a u32 offsets array of length row_count + 1.
- data_offset points to the concatenated payload bytes.
- The i-th value spans data[offs[i]..offs[i+1]].
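The slicing rule above can be sketched in Python as follows. The sketch assumes the offsets are little-endian u32 values measured from data_offset, as the data[offs[i]..offs[i+1]] notation suggests; `decode_varlen` is a hypothetical helper name.

```python
import struct


def decode_varlen(buf: bytes, row_count: int, data_offset: int,
                  aux_offset: int, as_text: bool = True):
    """Decode a str/bytes column from its offsets array and payload bytes."""
    # row_count + 1 offsets: the last one marks the end of the final value.
    offs = struct.unpack_from(f"<{row_count + 1}I", buf, aux_offset)
    out = []
    for i in range(row_count):
        raw = bytes(buf[data_offset + offs[i]: data_offset + offs[i + 1]])
        out.append(raw.decode("utf-8") if as_text else raw)
    return out


# Three values: "héllo" (6 UTF-8 bytes), "" (empty), "db".
payload = "héllo".encode("utf-8") + b"db"
offsets = struct.pack("<4I", 0, 6, 6, 8)
buf = offsets + payload
decoded = decode_varlen(buf, 3, data_offset=len(offsets), aux_offset=0)
```

Note that an empty value is represented by two equal consecutive offsets, so it needs no payload bytes.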
3) Operator ABI (v1)
Operators are pure batch-to-batch transforms. The module exports:
// Returns (ptr << 32) | len, like skein UDFs.
export fn skein_plan_eval(ptr: u32, len: u32) -> u64
Rules:
- The host writes the input batch into module memory at (ptr,len).
- The function returns a packed (ptr,len) for the output batch.
- Returning len=0 indicates end-of-stream.
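On the host side, the packed return value can be split back into its two halves as sketched below. The helper names are hypothetical. The masking step is there because Wasm i64 values carry no signedness and some host bindings surface them as signed integers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def pack_ptr_len(ptr: int, length: int) -> int:
    """Pack (ptr, len) as (ptr << 32) | len."""
    return ((ptr & MASK32) << 32) | (length & MASK32)


def unpack_ptr_len(packed: int):
    """Split a packed u64 into (ptr, len); tolerates a signed i64 input."""
    packed &= MASK64  # reinterpret a negative i64 as unsigned
    return packed >> 32, packed & MASK32


ptr, length = unpack_ptr_len(pack_ptr_len(0x1000, 48))
end_of_stream = unpack_ptr_len(pack_ptr_len(0x1000, 0))[1] == 0
```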
Memory management follows the UDF ABI in docs/WASM_UDFS.md.
4) Plan artifact format (skein.wasm.plan.v1)
The portable plan artifact is JSON, base64-encoded for transport:
{
"format": "skein.wasm.plan.v1",
"abi": "skein.wasm.batch.v1",
"target": "wasm32-unknown-unknown",
"execution": "generated_filter_project_v1",
"plan": {
"ops": [
{"op": "scan", "table": {"db": "app", "table": "users"}},
{"op": "filter", "predicate": {"op":"gt","a":{"col":"score"},"b":{"param":0}}},
{"op": "project", "projection": [{"expr":{"col":"id"}}, {"expr":{"col":"score"}}]}
]
},
"generated": {
"input_table_columns": [
{"name": "score", "type": {"kind": "u64", "unsigned": true}},
{"name": "id", "type": {"kind": "u64", "unsigned": true}}
],
"param_columns": [
{"name": "$param_0", "type": {"kind": "u64", "unsigned": true}}
],
"output_columns": [
{"name": "id", "type": {"kind": "u64", "unsigned": true}},
{"name": "score", "type": {"kind": "u64", "unsigned": true}}
],
"module_b64": "AGFzbQE..."
}
}
Rules:
- scan must be first and exactly once.
- project must be last and exactly once.
- filter is optional and must appear between scan and project.
- execution is generated_filter_project_v1 when the compiler can lower a fixed-width, non-null u64/bool filter/project plan into an embedded Wasm module.
- execution falls back to host_interpreted_v1 when the plan uses unsupported operators, nullable values, or non-fixed-width types.
- generated is present only for compiled artifacts and records the generated module plus its input/output column metadata.
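The ordering rules above can be checked on a decoded artifact as in this sketch. It assumes the transport encoding is standard base64 over UTF-8 JSON; `decode_artifact` and `check_plan_ops` are hypothetical names, and the check covers only the ordering rules, not operator support.

```python
import base64
import json


def decode_artifact(artifact_b64: str) -> dict:
    """ASSUMPTION: standard base64 alphabet wrapping UTF-8 JSON."""
    return json.loads(base64.b64decode(artifact_b64))


def check_plan_ops(ops: list) -> None:
    """Raise ValueError if the scan/project ordering rules are violated."""
    names = [op.get("op") for op in ops]
    if names.count("scan") != 1 or names[0] != "scan":
        raise ValueError("scan must be first and appear exactly once")
    if names.count("project") != 1 or names[-1] != "project":
        raise ValueError("project must be last and appear exactly once")
    # With scan first and project last, any filter necessarily sits between them.


artifact = {
    "format": "skein.wasm.plan.v1",
    "plan": {"ops": [
        {"op": "scan", "table": {"db": "app", "table": "users"}},
        {"op": "filter", "predicate": {"op": "gt", "a": {"col": "score"}, "b": {"param": 0}}},
        {"op": "project", "projection": [{"expr": {"col": "id"}}]},
    ]},
}
artifact_b64 = base64.b64encode(json.dumps(artifact).encode("utf-8")).decode("ascii")
decoded = decode_artifact(artifact_b64)
check_plan_ops(decoded["plan"]["ops"])  # passes silently
```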
5) SkeinQL methods
wasm.plan.compile
Params:
{
"query": {"body": {"select": {"projection": [{"expr": {"col": "id"}}], "from": [{"db": "app", "table": "users"}]}}},
"abi": "skein.wasm.batch.v1",
"target": "wasm32-unknown-unknown"
}
Result:
{
"format": "skein.wasm.plan.v1",
"abi": "skein.wasm.batch.v1",
"artifact_b64": "...",
"target": "wasm32-unknown-unknown",
"execution": "generated_filter_project_v1",
"artifact_bytes": 1488,
"operator_count": 3,
"operators": ["scan", "filter", "project"],
"supports_edge_package": true,
"supports_simd": false
}
wasm.plan.inspect
Params:
{"artifact_b64":"..."}
Result:
{
"format": "skein.wasm.plan.v1",
"abi": "skein.wasm.batch.v1",
"target": "wasm32-unknown-unknown",
"execution": "generated_filter_project_v1",
"artifact_bytes": 1488,
"operator_count": 3,
"operators": ["scan", "filter", "project"],
"table": {"db":"app","table":"users"},
"has_filter": true,
"projection_count": 2,
"supports_edge_package": true,
"supports_simd": false
}
wasm.plan.perf_report
Params:
{
"artifact_b64": "...",
"args": [{"t":"u64","v":7}],
"iterations": 5,
"warmup_iterations": 1
}
Result:
{
"format": "skein.wasm.plan.perf.v1",
"execution": "generated_filter_project_v1",
"iterations": 5,
"warmup_iterations": 1,
"operators": ["scan", "filter", "project"],
"outputs_match": true,
"simd": {
"candidate": true,
"enabled": false,
"strategy": "scalar_generated_filter_project_v1",
"notes": ["SIMD lane lowering is not emitted by this build"]
},
"host": {"rows": 1, "columns": 2, "latency": {"p50_ns": 1000}},
"generated": {"rows": 1, "columns": 2, "latency": {"p50_ns": 1200}},
"generated_speedup_vs_host": 0.83
}
This report is intended as an exploratory perf-test baseline. It flags fixed-width u64/bool generated artifacts as SIMD candidates and compares scalar generated Wasm against the host interpreter. supports_simd remains false until SkeinDB ships a production SIMD-lowered codegen path.
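The sample values above are consistent with generated_speedup_vs_host being the host p50 latency divided by the generated p50 latency, rounded to two decimals (1000 / 1200 ≈ 0.83). That reading is inferred from the example and not confirmed elsewhere in this document; the sketch below simply reproduces it with a hypothetical helper.

```python
def speedup_vs_host(host_p50_ns: float, generated_p50_ns: float) -> float:
    """ASSUMPTION: speedup = host p50 / generated p50, rounded to 2 places.

    Values below 1.0 mean the generated module was slower than the host.
    """
    return round(host_p50_ns / generated_p50_ns, 2)


report = {
    "host": {"latency": {"p50_ns": 1000}},
    "generated": {"latency": {"p50_ns": 1200}},
}
speedup = speedup_vs_host(report["host"]["latency"]["p50_ns"],
                          report["generated"]["latency"]["p50_ns"])
```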
wasm.plan.edge_package
Params:
{
"artifact_b64": "...",
"package_name": "users-score-plan"
}
Result:
{
"format": "skein.wasm.edge_package.v1",
"package_name": "users-score-plan",
"artifact_b64": "...",
"artifact_bytes": 392,
"artifact_sha256": "...",
"manifest_json": "{...}",
"runner_js": "export async function runSkeinWasmPlan(...) { ... }",
"instructions": [
"Store artifact_b64 and manifest_json with the edge worker or browser bundle.",
"Call runSkeinWasmPlanEdge with artifact_b64, input_rows, args, and result_format to execute the embedded generated Wasm module locally."
]
}
The v1 edge package ships the plan artifact, a manifest, and a JavaScript runner with two execution paths:
- runSkeinWasmPlanEdge(...) executes generated_filter_project_v1 artifacts locally in the edge/browser process. Callers provide rows for the artifact's input_table_columns; the runner encodes them as skein.wasm.batch.v1, appends typed args as parameter columns, runs the embedded module with WebAssembly.instantiate, and decodes the output batch.
- runSkeinWasmPlanHost(...) calls wasm.plan.run on a SkeinDB host for host_interpreted_v1 artifacts or deployments that prefer server execution.
runSkeinWasmPlan(...) chooses local execution when input_rows / inputRows or input_batch_b64 / inputBatchB64 is supplied for a generated artifact; otherwise it uses the host fallback.
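The selection rule can be modeled as below. This is a Python illustration of the decision only, not the shipped JavaScript runner: `choose_path` is a hypothetical name, and treating "supplied" as "present and not None" is an assumption.

```python
LOCAL_INPUT_KEYS = ("input_rows", "inputRows", "input_batch_b64", "inputBatchB64")


def choose_path(execution: str, options: dict) -> str:
    """Return "edge" for local execution or "host" for the wasm.plan.run fallback."""
    # ASSUMPTION: an option counts as supplied when it is present and not None.
    has_local_input = any(options.get(key) is not None for key in LOCAL_INPUT_KEYS)
    if execution == "generated_filter_project_v1" and has_local_input:
        return "edge"
    return "host"


path = choose_path("generated_filter_project_v1", {"input_rows": [{"id": 1, "score": 9}]})
```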
wasm.plan.run
Params:
{
"artifact_b64": "...",
"args": [{"t":"u64","v":7}],
"result_format": "objects_json",
"cache": {"want_etag": true},
"wire": {"format": "skeinpack_v1"}
}
Result: the same envelope as query.select (QueryExecResult).
When result_format: "wasm_batch_v1" is used, data contains a columnar batch:
{
"format": "skein.wasm.batch.v1",
"columns": [ {"name":"id","type":{"kind":"u64"}} ],
"batch_b64": "..."
}
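A client could turn such a payload back into row objects roughly as follows. The sketch handles only non-null u64 columns (type tag 3) and assumes the entries in columns appear in the same order as the ColumnMeta array inside the batch; `rows_from_result` is a hypothetical helper.

```python
import base64
import struct

HEADER_FMT = "<IHHIII"  # magic, version, flags, row_count, column_count, columns_offset
META_SIZE = 28          # seven u32 fields per ColumnMetaV1 (ASSUMPTION: no padding)


def rows_from_result(data: dict) -> list:
    """Decode a wasm_batch_v1 payload into a list of {name: value} dicts."""
    buf = base64.b64decode(data["batch_b64"])
    _magic, _ver, _flags, n_rows, n_cols, cols_off = struct.unpack_from(HEADER_FMT, buf, 0)
    if n_cols != len(data["columns"]):
        raise ValueError("column metadata does not match the batch header")
    rows = [dict() for _ in range(n_rows)]
    for i, col in enumerate(data["columns"]):
        type_tag, data_off = struct.unpack_from("<2I", buf, cols_off + i * META_SIZE)
        if type_tag != 3:
            raise NotImplementedError("this sketch only decodes u64 columns")
        values = struct.unpack_from(f"<{n_rows}Q", buf, data_off)
        for row, value in zip(rows, values):
            row[col["name"]] = value
    return rows


# Hand-built batch with a single u64 column "id" holding [1, 2].
header_size = struct.calcsize(HEADER_FMT)
body = struct.pack("<2Q", 1, 2)
batch = (struct.pack(HEADER_FMT, int.from_bytes(b"SKB1", "little"), 1, 0, 2, 1, header_size)
         + struct.pack("<7I", 3, header_size + META_SIZE, len(body), 0, 0, 0, 0)
         + body)
result = {
    "format": "skein.wasm.batch.v1",
    "columns": [{"name": "id", "type": {"kind": "u64"}}],
    "batch_b64": base64.b64encode(batch).decode("ascii"),
}
rows = rows_from_result(result)
```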
6) Prototype notes
Current implementation:
- Fixed-width non-null u64/bool scan/filter/project plans compile to embedded Wasm modules and run through Wasmtime on the server (execution: "generated_filter_project_v1").
- Unsupported plans remain portable through host interpretation (execution: "host_interpreted_v1").
- Only the scan/filter/project subset is accepted.
- abi and target are validated; target is recorded in the artifact for inspection and packaging.
- wasm.plan.inspect exposes artifact metadata without running the query.
- wasm.plan.edge_package emits standalone JavaScript edge execution for generated artifacts and a host-backed fallback for interpreted artifacts.
- wasm.plan.perf_report compares host and generated scalar execution, verifies output parity, and records SIMD candidate notes for future SIMD-lowered codegen work.