Engineering roadmap

Last updated 2026-04-19. Tracked in docs/PROJECT_BACKLOG.md.

SkeinDB ships one small, test-backed slice at a time. Each phase below groups a logical area. Every backlog item must come with unit and integration tests and documentation updates before it is merged.

Phase 0 — Repo setup

Done

Phase 1 — Storage core

Active

Phase 2 — SQL + virtual metadata

Done

Phase 3 — MySQL protocol

Done

Baseline handshake, COM_QUERY, COM_STMT parity, WordPress-class workload coverage, and an extensive scalar/date-time/JSON function set. Follow-up parity work continues through corpus growth.

Phase 4 — Web console

Done

HTTP API (/api/v1/sql/exec), schema browser, SQL editor, data browse/edit with CSV/JSON import/export, users/privileges, and a status dashboard.

Phase 5 — SkeinQL native API

Done

Native SkeinQL RPC (/api/v1/rpc) with schema.*, query.*, tx.*, system.*.

Phase 6 — Cache-coherent HTTP queries

Done

Row ETags, If-Match/If-None-Match, query.prepare, SSE subscriptions.
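As a rough illustration of how row ETags interact with If-Match and If-None-Match, the sketch below follows the general HTTP conditional-request rules rather than SkeinDB's actual handler. The function name `evaluate_preconditions` and its parameters are hypothetical, and the comma split is a simplification that assumes ETag values contain no commas.

```python
from typing import Optional


def _parse_etags(header: str) -> list[str]:
    """Split a comma-separated ETag list; '*' is kept as a literal wildcard."""
    return [tag.strip() for tag in header.split(",") if tag.strip()]


def _strip_weak(tag: str) -> str:
    """Drop the W/ prefix so weak comparison can ignore it."""
    return tag[2:] if tag.startswith("W/") else tag


def evaluate_preconditions(
    current_etag: Optional[str],
    method: str,
    if_match: Optional[str] = None,
    if_none_match: Optional[str] = None,
) -> int:
    """Return the status the preconditions imply: 200, 304, or 412.

    current_etag is None when the row does not exist (hypothetical convention).
    """
    if if_match is not None:
        tags = _parse_etags(if_match)
        if "*" in tags:
            matched = current_etag is not None
        else:
            # If-Match uses strong comparison: a weak tag never matches.
            matched = (
                current_etag is not None
                and not current_etag.startswith("W/")
                and current_etag in tags
            )
        if not matched:
            return 412
    if if_none_match is not None:
        tags = _parse_etags(if_none_match)
        if "*" in tags:
            matched = current_etag is not None
        else:
            # If-None-Match uses weak comparison: the W/ prefix is ignored.
            matched = current_etag is not None and _strip_weak(current_etag) in {
                _strip_weak(t) for t in tags
            }
        if matched:
            # Reads get "not modified"; writes get "precondition failed".
            return 304 if method in ("GET", "HEAD") else 412
    return 200
```

In this sketch a cached read revalidates with If-None-Match and receives 304 when the row is unchanged, while a write guarded by If-Match receives 412 if another writer changed the row first.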

Phase 7 — Delta-chained values

Done

DELTA value kind, selection policy + metrics, compaction rebase bounded by chain depth.
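The idea of delta-chained values with a bounded chain depth can be sketched as follows. This is a toy model, not SkeinDB's storage format: the `StoredValue` type, the field-update patch encoding, the `put`/`resolve`/`rebase` helpers, and the bound of 4 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CHAIN_DEPTH = 4  # assumed bound, not SkeinDB's actual setting


@dataclass
class StoredValue:
    kind: str                   # "FULL" or "DELTA"
    payload: dict               # full document, or only the changed fields
    base: Optional[int] = None  # id of the value a DELTA applies to


def chain_depth(store: dict, value_id: int) -> int:
    """Count DELTA hops from value_id back to its FULL base."""
    depth = 0
    while store[value_id].kind == "DELTA":
        value_id = store[value_id].base
        depth += 1
    return depth


def resolve(store: dict, value_id: int) -> dict:
    """Materialize a value by applying deltas oldest-first onto the FULL base."""
    patches = []
    while store[value_id].kind == "DELTA":
        patches.append(store[value_id].payload)
        value_id = store[value_id].base
    result = dict(store[value_id].payload)
    for patch in reversed(patches):
        result.update(patch)
    return result


def put(store: dict, new_id: int, new_doc: dict, prev_id: Optional[int] = None) -> None:
    """Stand-in selection policy: write a DELTA only while the chain stays in bounds."""
    if prev_id is not None and chain_depth(store, prev_id) < MAX_CHAIN_DEPTH:
        prev_doc = resolve(store, prev_id)
        # A field-update patch cannot express removed fields, so fall back to FULL.
        if set(prev_doc) <= set(new_doc):
            patch = {k: v for k, v in new_doc.items()
                     if k not in prev_doc or prev_doc[k] != v}
            store[new_id] = StoredValue("DELTA", patch, base=prev_id)
            return
    store[new_id] = StoredValue("FULL", dict(new_doc))


def rebase(store: dict, value_id: int) -> None:
    """Compaction step: rewrite a value as FULL so it no longer depends on its chain."""
    store[value_id] = StoredValue("FULL", resolve(store, value_id))
```

Bounding the depth keeps the cost of `resolve` predictable: a read never walks more than `MAX_CHAIN_DEPTH` deltas before reaching a full value.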

Phase 8 — Wasm extensions

Done

Phase 9 — Tamper-evident WAL audit

Done

WALHeader v2, hash chaining, checkpoint anchors, verify CLI/API.
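Hash chaining can be illustrated with a minimal sketch in which each record's hash commits to both its payload and the previous record's hash, so editing any earlier record invalidates everything after it. The record layout, the SHA-256 choice, the all-zero genesis value, and the `append`/`verify` helpers are assumptions for illustration and do not describe the WALHeader v2 format.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed anchor for the first record


def chain_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """A record's hash covers the previous hash (fixed 32 bytes) plus its payload."""
    return hashlib.sha256(prev_hash + payload).digest()


def append(log: list, payload: bytes) -> None:
    """Append (payload, hash) where the hash links back to the last record."""
    prev = log[-1][1] if log else GENESIS
    log.append((payload, chain_hash(prev, payload)))


def verify(log: list, anchor: bytes = GENESIS) -> int:
    """Return the index of the first record that breaks the chain, or -1 if intact."""
    prev = anchor
    for i, (payload, stored) in enumerate(log):
        if chain_hash(prev, payload) != stored:
            return i
        prev = stored
    return -1
```

Passing a trusted hash as `anchor` lets verification start partway through the log, which is roughly the role a checkpoint anchor would play: `verify(log[n:], anchor=log[n - 1][1])` checks only the records after position `n - 1`.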

Phase 10 — Hybrid row/column snapshots

Active

T100, T101, and T102 are in place. Snapshot builds materialize rows at the requested snapshot timestamp and emit a manifest plus .cseg sidecars. Simple single-table SELECTs can read through a column-scan cursor, and an explicit routing rule sends only eligible, covered reads to snapshots. Hybrid merge planning remains open.
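A routing rule of this kind might look like the sketch below. The eligibility conditions shown (single table, every referenced column present in the snapshot, matching timestamp) are guesses at what "eligible covered reads" could mean, and `SnapshotManifest`, `ReadRequest`, and `route` are hypothetical names rather than SkeinDB types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotManifest:
    table: str
    snapshot_ts: int
    columns: frozenset  # columns that have a column segment in the snapshot


@dataclass(frozen=True)
class ReadRequest:
    tables: tuple       # tables the query touches
    columns: frozenset  # every column the query reads or filters on
    read_ts: int


def route(read: ReadRequest, manifest: SnapshotManifest) -> str:
    """Return "snapshot" only when the snapshot alone can answer the read."""
    single_table = read.tables == (manifest.table,)
    covered = read.columns <= manifest.columns          # subset test
    same_point_in_time = read.read_ts == manifest.snapshot_ts
    if single_table and covered and same_point_in_time:
        return "snapshot"
    return "row"  # everything else stays on the row-store path
```

Making the rule explicit and conservative means an uncovered or multi-table read always falls back to the row path instead of returning partial results.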

Phase 11 — Compatibility telemetry

Done

Phase 12 — SkeinAdmin standalone console

Done

Phase 13 — Observability & server load stats

Done

Phase 14 — Cluster management

Active

Node identity, replication transport, CAS pull, read-only replicas, cluster.* RPCs, sharding metadata router (single-shard txns).

Phase 15 — Additional performance improvements

Active

T150 through T153 are now in place. Interned-column schema flags persist separately from the catalog. Single-table scan paths can precompile `eq`/`ne`/`in` predicates into ValueID comparisons. Row and snapshot executors materialize only the columns the query still needs. Eligible full scans run through a 1024-row scan->filter->project batch loop. The core MVCC layer exposes a validated visible-version cache keyed by row id plus snapshot bucket.
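To make the predicate precompilation and batch loop concrete, here is a small sketch under stated assumptions: an interned column is modeled as a dictionary from values to integer ids, predicates are resolved to ids once before the scan, and rows are processed in batches of 1024. The `InternedColumn` class and the `compile_predicate` and `scan` functions are hypothetical, and SQL NULL semantics are ignored.

```python
from typing import Callable, Iterator

BATCH_ROWS = 1024  # batch size named in the roadmap


class InternedColumn:
    """Dictionary-encoded column: each distinct value gets a small integer id."""

    def __init__(self) -> None:
        self.ids: dict = {}         # value -> ValueID
        self.rows: list = []        # one ValueID per row

    def append(self, value: object) -> None:
        # len(self.ids) is evaluated before insertion, so new values get the next id.
        self.rows.append(self.ids.setdefault(value, len(self.ids)))


def compile_predicate(col: InternedColumn, op: str, operand) -> Callable[[int], bool]:
    """Resolve the operand to ValueIDs once, so the scan compares integers only."""
    if op == "in":
        wanted = {col.ids[v] for v in operand if v in col.ids}
        return lambda vid: vid in wanted
    target = col.ids.get(operand)  # None means the value never occurs in the column
    if op == "eq":
        return lambda vid: vid == target
    if op == "ne":
        return lambda vid: vid != target
    raise ValueError(f"unsupported operator: {op}")


def scan(col: InternedColumn, pred: Callable[[int], bool],
         project: Callable[[int], object]) -> Iterator[list]:
    """scan -> filter -> project, one batch of up to BATCH_ROWS rows at a time."""
    for start in range(0, len(col.rows), BATCH_ROWS):
        batch = col.rows[start:start + BATCH_ROWS]                      # scan
        hits = [start + i for i, vid in enumerate(batch) if pred(vid)]  # filter
        yield [project(row) for row in hits]                            # project
```

Here `project` receives a row index and returns whatever columns the query still needs, so values for filtered-out rows are never materialized.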

How to follow

The most authoritative state is the GitHub repository. Pinned reading order: