SkeinDB Project Backlog
This backlog is designed for small PR-sized tasks. Each task should include tests.
Reality sync (2026-05-27)
This file is the core roadmap task inventory. It is not the best place to read current runtime maturity at a glance.
- [x] = implemented and exercised in runtime/tests.
- [ ] = still open.
- All 140 top-level core roadmap checkboxes are currently closed.
- Remaining partial areas and non-hardened work are tracked in docs/TRUE_STATUS_MATRIX.md, compatibility docs, and docs/RESEARCH_BACKLOG.md.
For the short implemented-vs-partial snapshot, see docs/TRUE_STATUS_MATRIX.md.
Phase 0 - Repo setup
- Status: complete in runtime + tests (crates/skeindb-core/src/lib.rs, crates/skeindb-core/tests/phase0_format.rs)
- [x] T001: Encoding primitives (VarU, Bytes/String, CRC32C)
- [x] T002: FileHeader read/write
- [x] T003: RecordFrame append/iterate
Phase 0 verification checklist:
- [x] T001 evidence: VarU and hash/CRC tests (tests::varu_roundtrip_*, tests::value_id_is_stable, tests::audit_hash_is_stable) in crates/skeindb-core/src/lib.rs
- [x] T002 evidence: FileHeader encode/decode + corruption tests in crates/skeindb-core/src/lib.rs and file roundtrip in crates/skeindb-core/tests/phase0_format.rs
- [x] T003 evidence: RecordFrame append/decode/iterate + truncation/CRC tests in crates/skeindb-core/src/lib.rs and file-backed iteration in crates/skeindb-core/tests/phase0_format.rs
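This backlog does not spell out the VarU wire format. The sketch below assumes it is a LEB128-style unsigned varint (7 payload bits per byte, continuation bit in the high bit); that layout and the names `encode_varu` / `decode_varu` are illustrative assumptions, not the crate's confirmed API. It shows the kind of roundtrip and truncation behavior the T001 tests would exercise.

```rust
/// Append `value` as an unsigned LEB128-style varint.
/// Assumption: 7 payload bits per byte, high bit set on all but the last byte.
pub fn encode_varu(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decode one varint from the front of `input`.
/// Returns (value, bytes consumed), or None on truncated input or overflow.
pub fn decode_varu(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in input.iter().enumerate() {
        if i >= 10 {
            return None; // a u64 never needs more than 10 bytes
        }
        let payload = (byte & 0x7f) as u64;
        if i == 9 && payload > 1 {
            return None; // the 10th byte may only carry the top bit
        }
        value |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of bytes while the continuation bit was still set
}
```

Under this assumed layout, 300 encodes to the two bytes `0xAC 0x02`, and a lone `0x80` is rejected as truncated.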
Phase 1 - Storage core
- [x] T010: MANIFEST.log reader/writer. Latest:
crates/skeindb-core/src/manifest.rs now provides a typed append-only MANIFEST implementation over FileHeader(FileKind::Manifest) + RecordFrame payloads with five v1 record variants (AddFile, RemoveFile, SetCurrentVersion, SetLastLsn, CleanShutdown), replayable ManifestState derivation, and file-backed ManifestWriter/ManifestReader APIs. Unit tests cover record encode/decode for all variants, unknown-tag/file-kind rejection, state-apply behavior (add/update/remove), file roundtrip, writer reopen replay, and bad-header rejection.
- [x] T011: WAL writer/reader + recovery. Latest:
crates/skeindb-core/src/wal.rs now implements FileHeader(FileKind::Wal) + RecordFrame WAL files with typed v1 body records (BEGIN_TXN, MUTATION, COMMIT_TXN, ABORT_TXN) layered on the existing WAL header prefix, strict read-all and lenient committed-transaction recovery, and a file-backed WalWriter that truncates torn/corrupt tails before appending. Recovery emits only committed txns in log order, discards aborted txns, and reports truncated tail bytes. Unit tests cover v1/v2 record decode/roundtrip, unknown record rejection, committed-only recovery, torn-tail truncation on reopen, and bad-header rejection; integration tests cover file-backed roundtrip and truncated-tail recovery.
- [x] T012: ValueStore (.vseg) append/read + ValueID. Latest:
crates/skeindb-core/src/valuestore.rs now adds ValueSegmentWriter/ValueSegmentReader over FileHeader(FileKind::ValSeg) + framed VE1 records, ValueStore::write_segment_file/load_segment_file convenience helpers, explicit ValueId preservation for raw and custom-ID entries, and DELTA persistence via DELTA1 payloads stored inside VE1 raw_bytes. Loading recomputes delta depths, validates materialization through the existing hash-checked reconstruction path, and rebuilds learned-index state in memory. Integration tests cover file-backed roundtrip with RAW + DELTA + custom-ID EMBEDDING entries, writer reopen append semantics, and bad-header rejection.
- [x] T013: Sorted runs (.run) + simple LSM (memtable + level0). Latest:
crates/skeindb-core/src/run.rs now implements immutable .run files over FileHeader(FileKind::Run) with typed DataBlock / IndexBlock / footer encoding, a RunWriter that enforces strictly increasing keys and block-splits by target size, a RunReader with full-scan and binary-searched point lookups, and a SimpleLsm that keeps a memtable in a BTreeMap, flushes to run-######.run, and reads level0 newest-first. Tests cover out-of-order writer rejection, bad-header rejection, multi-block file roundtrip, memtable flush + overwrite visibility across multiple level0 runs, and reopen/discovery of existing runs.
- [x] T014: RowSeg (.rseg) + RowVersion encoding. Latest:
crates/skeindb-core/src/rowseg.rs now implements the RV1 row-version record (rec_type=0x10, rec_ver=1) over FileHeader(FileKind::RowSeg) + RecordFrame framing, with a FilePtr (12-byte file_id + offset) for MVCC chain pointers, RowGroupRef::Inline / RowGroupRef::ValueId group payloads, strict decode (rejects trailing bytes, unknown rec_type/rec_ver, unknown group ref kinds), and RowSegmentWriter::{create, open, append → FilePtr, sync} / RowSegmentReader::{open, read_all, read_at} APIs so callers can chain per-row version histories. Unit tests cover RV1 encode/decode roundtrips, bad header/type/version/trailing-bytes rejection, delete-flag handling, and FilePtr sentinel semantics; integration tests cover file-backed write/read roundtrips with chained prev_ptr, reopen/append semantics, mixed Inline/ValueId groups, and rejection of wrong-FileKind files.
- [x] T015: RowDir (row_id -> head ptr). Latest:
crates/skeindb-core/src/rowdir.rs now provides an in-memory BTreeMap<u64, RowDirEntry> mapping row_id to FilePtr head pointers, with Live/Tombstone variants so deletions can shadow older-run state. Persistence reuses the existing .run format: flush_to_run writes sorted big-endian row_id keys with [tag=0][FilePtr(12)] live values or [tag=1] tombstones; load_from_run/from_run replays them, applying tombstones as removals and live entries as upserts. Unit tests cover put/get/remove/forget, sorted iteration, live/tombstone encode-decode, and bad tag/length rejection. Integration tests (crates/skeindb-core/tests/rowdir.rs) cover file-backed roundtrip preserving row_id order, tombstone-in-newer-run shadowing older-run live entries, live-in-newer overriding live-in-older, empty-run roundtrip, and tombstone-only run shadowing multiple live entries.
- [x] T016: MVCC visibility. Latest:
crates/skeindb-core/src/mvcc.rs now resolves snapshot visibility by walking RowVersion::prev_ptr chains over .rseg records loaded through RowSegmentReader, using begin_ts != 0 && begin_ts <= snapshot_ts && (end_ts == 0 || end_ts > snapshot_ts) as the committed-visibility rule and treating Snapshot::latest() as +INF. resolve_head starts from an explicit FilePtr, resolve_row_id starts from RowDir::get(row_id), and MvccLookup distinguishes missing rows, visible live rows, and visible delete markers. The module includes a RowVersionResolver trait plus RowSegmentSet for file-backed multi-segment lookup, cycle / row-id / table-id mismatch detection, and staged-row skipping (begin_ts == 0). Unit tests cover snapshot-window evaluation, delete-marker visibility, historical chain walking, staged-head fallback, and cycle / row-id mismatch detection; integration tests (crates/skeindb-core/tests/mvcc.rs) cover file-backed delete-marker history, RowDir-based lookup across multiple segment files, and staged-head fallback to the previous committed version.
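The T016 visibility rule can be illustrated with a small in-memory model. This is a sketch only: the `Version` struct, the `Lookup` enum, and the slice index standing in for `prev_ptr` are hypothetical simplifications, not the types in mvcc.rs. The predicate itself is the one quoted in the T016 entry.

```rust
/// Hypothetical stand-in for a row version: only the fields the rule reads.
/// `prev` is a slice index standing in for the on-disk `prev_ptr`.
#[derive(Debug, Clone, PartialEq)]
pub struct Version {
    pub begin_ts: u64, // 0 = staged, not yet committed
    pub end_ts: u64,   // 0 = not yet superseded
    pub is_delete: bool,
    pub prev: Option<usize>,
}

/// The "latest" snapshot, modelled as +INF.
pub const LATEST: u64 = u64::MAX;

/// Committed-visibility rule as quoted in the T016 entry.
pub fn is_visible(v: &Version, snapshot_ts: u64) -> bool {
    v.begin_ts != 0
        && v.begin_ts <= snapshot_ts
        && (v.end_ts == 0 || v.end_ts > snapshot_ts)
}

#[derive(Debug, PartialEq)]
pub enum Lookup {
    Missing,
    Live(usize),
    Deleted(usize),
    Corrupt, // cycle or dangling pointer
}

/// Walk the chain from `head` toward older versions until one is visible.
pub fn resolve(chain: &[Version], head: usize, snapshot_ts: u64) -> Lookup {
    let mut cur = Some(head);
    let mut steps = 0usize;
    while let Some(i) = cur {
        // A well-formed chain visits each version at most once.
        if steps >= chain.len() {
            return Lookup::Corrupt;
        }
        steps += 1;
        let v = match chain.get(i) {
            Some(v) => v,
            None => return Lookup::Corrupt,
        };
        if is_visible(v, snapshot_ts) {
            return if v.is_delete { Lookup::Deleted(i) } else { Lookup::Live(i) };
        }
        cur = v.prev; // staged or out-of-window: fall back to the older version
    }
    Lookup::Missing
}
```

A staged head (`begin_ts == 0`) is skipped by the same predicate, so the walk falls through to the previous committed version without a special case.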
Phase 2 - SQL + virtual metadata
- [x] T020: Catalog schema + TableDef
- [x] T021: information_schema.tables + columns
- [x] T022: Minimal executor: CREATE TABLE, INSERT, SELECT scan+filter+limit
Phase 3 - MySQL protocol
- Status: baseline protocol plus broad COM_QUERY / COM_STMT compatibility are implemented in runtime/tests; follow-up parity work now continues through corpus growth and later backlog phases rather than open Phase 3 checkboxes.
- [x] T030: Handshake + mysql_native_password
- [x] T031: COM_QUERY SELECT literals
- [x] T032: SQL translator (subset)
- [x] T033: DDL/DML subset for corpus.sql
- [x] T034: SQL_CALC_FOUND_ROWS + FOUND_ROWS
- [x] T035: Index-backed secondary/unique index enforcement for MySQL duplicate-key semantics (runtime duplicate-key checks now reuse the shared secondary-index cache, including
PRIMARY KEY-changing UPDATEs; MySQL duplicate-key failures now surface as 1062/23000 on the wire; creating a MySQL compatibility UNIQUE INDEX still rejects pre-existing duplicate rows; and per-table secondary-index cache metadata now persists/reloads on reopen)
- [x] T036: Broaden COM_QUERY parity for WordPress-class workloads (the MySQL listener now covers the checked-in WordPress-style corpus and companion integration tests, including grouped/simple aggregate shims with
HAVING, projection-grouped GROUP BY de-dup including wildcard projections after expansion, SQL_CALC_FOUND_ROWS, wildcard join projections, top-level comma joins and left-associative join chains, parenthesized boolean predicates, index DDL, bootstrap/session compatibility SET and SHOW forms, recursive/nested compatibility rewrites for the current subquery subset, and compatibility no-op LOCK TABLES / UNLOCK TABLES)
- [x] T037: Deepen COM_STMT parity beyond the current baseline (complex-query result metadata, stricter driver/cursor semantics, fuller protocol coverage; prepare-time metadata now also covers supported scalar-expression projections including baseline arithmetic, broader scalar/date-time functions including
FIND_IN_SET/ISNULL, DATE_FORMAT/FROM_UNIXTIME, DATEDIFF/TIMESTAMPDIFF, WEEKDAY/DAYOFWEEK/DAYOFYEAR, MONTHNAME/DAYNAME, QUARTER, LAST_DAY, EXTRACT(<unit> FROM ...), and baseline interval arithmetic through DATE_ADD/DATE_SUB/TIMESTAMPADD, supported subquery-compat SELECTs whose WHERE clauses rewrite cleanly, including the current IN/EXISTS/simple scalar-compare subset, the current nested compatibility path, limited negated boolean-tree wrappers when they can still rewrite cleanly, supported projection-level scalar subqueries, embedded scalar-subquery arithmetic, as well as CASE/CAST and simple aggregate / grouped-aggregate compatibility queries). Progress: the new scalar/date-time functions, COM_INIT_DB, and COM_STATISTICS wire commands broaden the prepared-statement surface, and the latest slice adds dedicated unit + MySQL-wire regressions for projection-subquery metadata parity.
- [x] T038: Broaden COM_QUERY beyond the current WordPress-class baseline (deeper correlated/nested subqueries beyond the current recursive
IN/EXISTS/simple scalar-compare compatibility path, broader join parity beyond the current left-associative ON plus simple base-table USING subset, broader date/time/function parity beyond the current scalar/date-time baseline, and broader ALTER TABLE variants beyond the current ADD/MODIFY/CHANGE/RENAME COLUMN/RENAME [KEY|INDEX]/RENAME TO/DROP COLUMN plus index metadata surface). Progress:
  - Initial surface expansion: added BETWEEN/NOT BETWEEN, COUNT(DISTINCT col), GROUP_CONCAT(), INSERT ... SELECT, UNION/UNION ALL, TRUNCATE TABLE, DROP DATABASE, RENAME TABLE, EXPLAIN stub, DO, SAVEPOINT stubs, CREATE VIEW/DROP VIEW stubs, locking hint stripping, session functions (USER(), LAST_INSERT_ID(), CONNECTION_ID()), information_schema.schemata/statistics, expanded SHOW commands (WARNINGS, ERRORS, PROCESSLIST, TRIGGERS, EVENTS, PROCEDURE STATUS, FUNCTION STATUS, PLUGINS, PROFILES, CREATE DATABASE), SET GLOBAL/FLUSH/ANALYZE/OPTIMIZE/CHECK/REPAIR/KILL no-ops, and 30+ additional scalar/date-time functions. Corpus expanded from 772 to 947 lines (283 statements).
  - Later batch: derived tables (FROM subqueries), CTEs (WITH ... AS), REGEXP/RLIKE/NOT REGEXP, <=> (NULL-safe equality), NATURAL JOIN, FULL OUTER JOIN (fully executed), multi-table DELETE, multi-table UPDATE (stub), 11 JSON functions (JSON_EXTRACT, JSON_UNQUOTE, JSON_OBJECT, JSON_ARRAY, JSON_CONTAINS, JSON_LENGTH, JSON_TYPE, JSON_VALID, JSON_SET, JSON_KEYS, JSON_MERGE_PRESERVE), plus FIELD/ELT, INET_ATON/INET_NTOA, BIN/OCT/CONV, and hash functions (CRC32, MD5, SHA1/SHA, SHA2). Corpus now at 1130 lines (over 374 statements).
  - Later batch: multi-column GROUP BY with multiple group columns and aggregates, 12 new scalar functions (SUBSTRING_INDEX, ASCII, ORD, CHAR, STRCMP, BIT_LENGTH, OCTET_LENGTH, REGEXP_REPLACE, REGEXP_SUBSTR, TO_BASE64, FROM_BASE64), 5 new information_schema stub tables (routines, triggers, views, processlist, user_privileges).
  - Later batch: window functions (ROW_NUMBER()/RANK()/DENSE_RANK() with OVER(PARTITION BY ... ORDER BY ...)), SET @var = ... / SELECT @var user variables, BIT_AND()/BIT_OR()/BIT_XOR() bitwise aggregates, multi-table UPDATE (upgraded from stub to real per-row implementation), 6 new scalar functions (DEGREES, RADIANS, PERIOD_ADD, PERIOD_DIFF, MAKEDATE, MAKETIME). Corpus now at 1240+ lines (over 370 statements).
  - Later batch: corpus.sql fully expanded, with all 16+ TODO blocks uncommented (IF/NULLIF, EXISTS, REGEXP, CAST, COUNT DISTINCT, INFORMATION_SCHEMA, LOCK/UNLOCK, window functions, CTEs, RIGHT/CROSS JOIN, derived tables, NOT EXISTS, IN/scalar subquery, multi-table DELETE/UPDATE, nested functions, SHOW PROCESSLIST/PLUGINS). ~60 new SQL statements added covering additional JOINs, INSERT...SELECT, DO, EXPLAIN, SHOW variants, system variables (@@version etc.), maintenance no-ops, CREATE/DROP VIEW, SAVEPOINT, GROUP_CONCAT DISTINCT, multi-column GROUP BY, session functions, locking hints, scalar functions, SET GLOBAL. Corpus now at 1657 lines (about 678 semicolon-terminated SQL statements) after the fully expanded compatibility sweep.
  - Latest batch: correlated subqueries in projection (SELECT name, (SELECT COUNT(*) FROM orders WHERE user_id = users.id) FROM users), binary comparison operators in scalar expressions (>, <, >=, <=, =, !=, <>), multi-aggregate GROUP BY with ORDER BY support over JOINs, embedded subquery pre-evaluation in arithmetic expressions (salary - (SELECT AVG(salary) FROM users)), expression-based UPDATE SET values with per-row evaluation (UPDATE users SET salary = salary * 1.1 WHERE ... via the data_update_exprs engine method), WordPress Site Health-style information_schema.TABLES storage summaries, WordPress Users-screen role counts via COUNT(NULLIF(<predicate>, false)), and dedicated MySQL-wire regressions for WordPress installer/admin seed queries. A fresh live WordPress admin sweep across the core dashboard/content/settings surfaces now finishes with an empty debug.log; only the theme-owned nav-menus/widgets pages still return non-database 500s.
  Deeper parity work is still ongoing.
Phase 4 - Web console
- [x] T040: HTTP API /api/v1/sql/exec
- [x] T041: Console UI scaffold
- [x] T042: Schema browser + SQL editor
- [x] T043: Data browse/edit + import/export (CSV + JSON export/import)
- [x] T044: Users/privileges + status dashboard
Phase 5 - SkeinQL native API
- [x] T050: Define SkeinQL request/response types + error model (docs/SKEINQL.md)
- [x] T051: Implement HTTP RPC endpoint POST /api/v1/rpc (system.ping, system.version)
- [x] T052: Implement schema.* methods (list/describe/create/drop)
- [x] T053: Implement query.select (single-table scan + filter + limit) over SkeinIR
- [x] T054: Implement tx.begin/commit/rollback via SkeinQL
Phase 6 - Cache-coherent HTTP queries (ETags)
- [x] T060: Row ETags for data.get and If-Match support for data.update
- [x] T061: Planner dependency sets for simple indexed queries
- [x] T062: query.prepare + GET /api/v1/q/{query_id} with ETag/If-None-Match
- [x] T063: SSE subscription to ETag changes (query.subscribe)
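The conditional-GET flow behind T061/T062 can be sketched as follows. Everything here is an assumption for illustration: deriving the ETag by hashing the query id with per-table versions from the dependency set, the `query_etag` and `conditional_get` names, and the `Response` enum are hypothetical, and the std `DefaultHasher` stands in for whatever digest the server really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical ETag: a hash over the query id and the versions of every
/// table in the query's dependency set. Any dependency change alters the tag.
pub fn query_etag(query_id: &str, dep_versions: &[(String, u64)]) -> String {
    let mut hasher = DefaultHasher::new();
    query_id.hash(&mut hasher);
    for (table, version) in dep_versions {
        table.hash(&mut hasher);
        version.hash(&mut hasher);
    }
    format!("\"{:016x}\"", hasher.finish())
}

#[derive(Debug, PartialEq)]
pub enum Response {
    NotModified { etag: String },      // maps to HTTP 304, no body
    Ok { etag: String, body: String }, // maps to HTTP 200 with a fresh result
}

/// Serve a prepared query, skipping execution when If-None-Match matches.
pub fn conditional_get(
    if_none_match: Option<&str>,
    etag: &str,
    run_query: impl FnOnce() -> String,
) -> Response {
    // The header may carry a comma-separated list of tags, or "*".
    let matches = if_none_match.map_or(false, |header| {
        header
            .split(',')
            .map(str::trim)
            .any(|candidate| candidate == "*" || candidate == etag)
    });
    if matches {
        Response::NotModified { etag: etag.to_string() }
    } else {
        Response::Ok { etag: etag.to_string(), body: run_query() }
    }
}
```

The point of the design is that a matching tag lets the server answer without running the query at all, which is why `run_query` is only invoked on the miss path.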
Phase 7 - Delta-chained values
- [x] T070: Add ValueEntry kind DELTA + patch codec (docs/DELTA_VALUES.md)
- [x] T071: Delta selection policy + metrics
- [x] T072: Compaction rebase (limit delta chain depth)
Phase 8 - Wasm extensions
- [x] T080: Module store + catalog metadata for UDFs (docs/WASM_UDFS.md). Latest:
crates/skeindb-core/src/wasm_catalog.rs now stores Wasm modules as immutable ValueKind::BlobChunk values inside ValueStore and tracks typed metadata in a WasmModuleCatalog persisted as wasm_catalog.json (format v1). Catalog entries include module_id, optional name, UDF kind (scalar/aggregate/table), ABI string, entrypoint symbol, ValueId, size, creation timestamp, and capability metadata (allowed_hostcalls, per-table read/write permissions, determinism, fuel/memory/output budgets). The core API supports install/list/get/drop, overwrite-on-install, module-byte materialization back out of ValueStore, strict validation for empty ids/entrypoints/ABIs/modules, and strict JSON load validation for unsupported format versions or malformed value_id hex. Unit tests cover install/overwrite/drop flows, invalid-request rejection, and value-id hex roundtrip; integration tests cover catalog + .vseg roundtrip preserving both metadata and module bytes plus JSON-load rejection for bad format versions and malformed value_id strings.
- [x] T081: Scalar UDF execution sandbox with resource limits. Latest:
crates/skeindb-core/src/wasm_udf.rs now executes scalar Wasm modules directly from WasmModuleCatalog + ValueStore via wasmtime, using the skein.wasm.udf.v1 ABI with host-side value encoding, a required exported memory, skein_alloc(len) allocator, and a scalar entrypoint export (typically skein_scalar(ptr, len) -> u64, returning ptr<<32 | len). The sandbox enforces capability-gated imports (currently only skein.log_debug mapped from allowed_hostcalls = ["log.debug"]), memory size limits via store resource limiting, output byte limits via max_output_bytes, and keeps filesystem/network/clock/random unavailable by exposing no such imports. Unit tests cover scalar value encode/decode roundtrips and trailing-byte rejection; integration tests cover constant-value execution, hostcall denial/allow flows for log.debug, output-size rejection, and memory-growth failure beyond the configured limit.
- [x] T082: Safe cancellation (fuel/time budget) + tests. Latest:
crates/skeindb-core/src/wasm_udf.rs now enforces per-module max_fuel budgets via Wasmtime fuel metering when configured, adds a bounded wall-clock deadline via epoch interruption, and surfaces explicit FuelExhausted/TimeoutExceeded errors instead of collapsing cancellation into generic execution failures. The default execute_scalar_udf(...) path keeps a conservative host timeout, while execute_scalar_udf_with_options(...) allows embedders and tests to override it. Integration tests now cover deterministic infinite-loop cancellation on fuel exhaustion, host timeout cancellation when max_fuel = 0, and successful execution of a later UDF after a cancelled call.
- [x] T083: Aggregate and table-function UDFs. Latest:
crates/skeindb-core/src/wasm_udf.rs now executes aggregate and table Wasm modules through the same Wasmtime sandbox used for scalar UDFs. Aggregate modules consume a one-shot encoded row batch via execute_aggregate_udf(...) and return a single encoded value; table modules consume encoded args via execute_table_udf(...) and return an encoded row set. Shared encode_rows(...)/decode_rows(...) helpers define the row-batch ABI, and the runtime reuses the existing capability checks, memory/output limits, and T082 fuel/timeout cancellation path for all three module kinds. Unit tests now cover row-batch encode/decode roundtrips; integration tests cover an aggregate module that sums row values and a table module that materializes rows from input arguments.
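The T081 entry describes the scalar entrypoint as returning `ptr<<32 | len` in a single `u64`. A minimal host-side sketch of that packing, plus the bounds and `max_output_bytes` checks a host would plausibly apply before reading guest memory, is shown below. The function names, the `OutputError` enum, and modelling guest memory as a plain byte slice are illustrative assumptions; the real runtime goes through wasmtime.

```rust
/// Pack a guest pointer and length the way the ABI note describes:
/// pointer in the high 32 bits, length in the low 32 bits.
pub fn pack_result(ptr: u32, len: u32) -> u64 {
    ((ptr as u64) << 32) | (len as u64)
}

/// Inverse of `pack_result`.
pub fn unpack_result(packed: u64) -> (u32, u32) {
    ((packed >> 32) as u32, packed as u32)
}

#[derive(Debug, PartialEq)]
pub enum OutputError {
    TooLarge { len: u32, max: u32 },
    OutOfBounds,
}

/// Hypothetical host-side read: `memory` stands in for the module's exported
/// linear memory. The length limit is checked before any bytes are touched.
pub fn output_slice(
    memory: &[u8],
    packed: u64,
    max_output_bytes: u32,
) -> Result<&[u8], OutputError> {
    let (ptr, len) = unpack_result(packed);
    if len > max_output_bytes {
        return Err(OutputError::TooLarge { len, max: max_output_bytes });
    }
    let start = ptr as usize;
    let end = start.checked_add(len as usize).ok_or(OutputError::OutOfBounds)?;
    memory.get(start..end).ok_or(OutputError::OutOfBounds)
}
```

Checking the length against the budget first means an oversized result is rejected without depending on how much memory the guest has grown.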
Phase 9 - Tamper-evident WAL audit
- [x] T090: WALHeader v2 with hash chaining (docs/AUDIT_WAL.md)
- [x] T091: checkpoint anchors + audit status
- [x] T092: audit verify CLI/API + console page. Latest: SkeinAdmin's Forensics panel now exposes
maintenance.audit_status and maintenance.audit_verify alongside filtered forensic.query, proof-backed forensic.verify, and forensic.export report bundle tools.
Phase 10 - Hybrid row/column snapshots
- [x] T100: Snapshot builder (scan MVCC at snapshot_ts) + cseg writer (docs/COLUMN_SNAPSHOTS.md). Latest: snapshot builds now honor
snapshot_ts, persist manifest.json + .cseg sidecars under data/snapshots/, and keep those artifacts in sync during incremental refresh.
- [x] T101: Snapshot reader + column scan operator. Latest: simple single-table SELECT execution now loads projected and PK columns from snapshot manifests and
.cseg sidecars instead of cloning in-memory snapshot rows.
- [x] T102: Optimizer rule: use column snapshots for covered ranges. Latest: an explicit optimizer rule now chooses snapshot scans for covered current-time single-table SELECTs, including covered
DISTINCT projections, projection-only GROUP BY, compatible HAVING over grouped projected columns or aliases, and broad equality-index prefilter shapes, when the cost model beats both a full row scan and the competing row-side candidate scan.
Phase 11 - Compatibility telemetry and migration hints
- [x] T110: Feature flag instrumentation in MySQL translator
- [x] T111: Internal storage for telemetry counters + query fingerprints (optional)
- [x] T112: telemetry.compat_summary endpoint + console dashboard
- [x] T113: telemetry.migration_hints generator (MySQL patterns -> SkeinQL calls)
Phase 12 - Standalone management console (SkeinAdmin)
- [x] T120: SkeinAdmin placeholder scaffold (web/skeinadmin) + connection profiles
- [x] T121: SkeinAdmin pages: schema/data/sql workspace
- [x] T122: SkeinAdmin security: token UI + role-aware navigation. Latest: dedicated Security panel remains reachable from both sidebar and top-tab navigation, with create/list/revoke token flows using modal confirmations instead of browser dialogs.
- [x] T123: SkeinAdmin cluster page (cluster.*) + actions. Latest: join/leave/remove/promote controls are all surfaced in the live cluster panel.
- [x] T124: SkeinAdmin observability page (stats.*) — comprehensive dashboard with runtime, storage/dedup, MVCC/compaction, query/cache stats + auto-refresh
- [x] T125: SkeinAdmin Easy Viewer (phpMyAdmin-inspired) — sidebar tree, sub-tabs, inline editing, search, export, operations. Latest: inline New DB flow, live create-table SQL preview, duplicate-column / identifier validation before create, required-field validation before insert, column sorting (click-to-sort headers), styled modal confirmations (replacing browser confirm()), search operator dropdown (LIKE/=/!=/>/</BETWEEN/IS NULL/IS NOT NULL/REGEXP), visual query builder tab (column picker, WHERE condition builder, ORDER BY/LIMIT, SQL preview, execute/copy/send), 5 new dashboard cards (Top Tables, Slow Query Log, Active Sessions, Index Health, Research Track Status). 2026-04-25 (v0.3.4): wired the previously-stubbed Top Tables / Slow Query Log / Active Sessions / Index Health cards to live RPCs (
information_schema.tables via sql.exec, stats.slow_queries, stats.snapshot) through new silentRpc/unwrapCellValue helpers; fixed three r?.secret / r?.tokens / r?.queries response-shape bugs in securityCreateToken/securityRefreshTokens/securityTopQueries so the panel now reads from r.json.result.*; reordered the create-token flow so the fresh secret is no longer overwritten by the subsequent token list refresh; added auto-refresh on overview/security panel switches; relabeled the Active Sessions card to match stats.snapshot fields (Sessions/Open Txns/Avg Latency).
- [x] T126: SkeinAdmin Engine Config panel: checkbox toggles for dedup, compression, encryption, MVCC, delta chains, time travel, compaction, cache, security, replication, CDC, QUIC. Latest: storage mode selector is aligned with the real runtime values
json, segment, and hybrid.
Phase 13 - Observability and server load statistics
- [x] T130: stats.snapshot and basic counters in server. 2026-05-25:
stats.snapshot now also synthesizes a basic alerts block from the existing query tail-latency, CDC backpressure, and compaction-pressure telemetry so operators can see current warning/critical conditions without scraping multiple subtrees. 2026-05-27: settings-backed observability.alert_routes now annotate stats.snapshot.alerts with matched route IDs/targets plus routing.{configured,routed_alerts,matched_routes} summary metadata. Matched http:// and https:// targets now receive a JSON POST once per active alert while repeated polls are suppressed until the alert clears, with per-route and top-level delivery counters exposed in the snapshot. Coverage: server::tests::stats_snapshot_routes_operator_alerts_from_settings, server::tests::stats_snapshot_delivers_http_alert_routes_once_per_active_alert.
- [x] T131: query fingerprinting + top_queries / slow_queries
- [x] T132: GET /metrics (Prometheus-style) + labels
- [x] T133: Console widgets for CPU/memory/disk/QPS/TPS/compaction — Overview dashboard with stat cards, dedup bar chart, auto-refresh. 2026-04-25 (v0.3.5):
stats.snapshot now exposes real storage.total_rows / storage.total_tables / storage.disk_bytes (computed by walking the data dir) plus new top-level mvcc.{versions, delta_chains} and cache.{hit_pct, size_bytes, hits, misses} sections, and a new etag_hits counter increments on both If-None-Match paths in GET /api/v1/q/{id}; the dashboard cards that previously rendered "--" now show live values. 2026-05-25: the Overview panel now renders an Operational Alerts card from stats.snapshot.alerts, summarizing current query tail-latency, CDC backpressure, and compaction-pressure warnings/criticals in one place.
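The T130 entry says a matched route receives one POST per active alert, with repeated polls suppressed until the alert clears. One way to get that behavior is to remember which alert ids have already been delivered and forget them once they drop out of the active set. The `AlertRouter` type and its `poll` method below are hypothetical names for illustration; the actual HTTP delivery is left out.

```rust
use std::collections::HashSet;

/// Hypothetical once-per-active-alert tracker. It only decides *which*
/// alerts need a delivery on this poll; sending the POST is out of scope.
#[derive(Default)]
pub struct AlertRouter {
    delivered: HashSet<String>, // alert ids already sent while still active
    pub deliveries: u64,        // running delivery counter
}

impl AlertRouter {
    /// Called on every snapshot poll with the ids of currently active alerts.
    /// Returns the ids that should be delivered now.
    pub fn poll(&mut self, active: &[&str]) -> Vec<String> {
        let active_set: HashSet<&str> = active.iter().copied().collect();
        // Forget alerts that cleared, so a later recurrence is delivered again.
        self.delivered.retain(|id| active_set.contains(id.as_str()));

        let mut to_send = Vec::new();
        for id in active {
            // `insert` returns true only the first time an id is seen.
            if self.delivered.insert(id.to_string()) {
                to_send.push(id.to_string());
            }
        }
        self.deliveries += to_send.len() as u64;
        to_send
    }
}
```

With this shape, polling the same active alert twice yields one delivery, and the alert is delivered again only after a poll in which it was absent.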
Phase 14 - Cluster management and scale-out
- [x] T140: Node identity (node_id) + cluster config model
- [x] T141: Replication transport protocol (primary -> replica fanout over SkeinQL RPC)
- [x] T142: CAS object pull protocol (replica fetch missing ValueIDs; objects.need/missing/fetch RPCs + Bloom contains)
- [x] T143: Read-only replica serving + router (cluster.route_query RPC + replica write rejection)
- [x] T144: cluster.* SkeinQL endpoints + join tokens + promote replica
- [x] T145: Sharding metadata + router prototype (single-shard txns)
Phase 15 - Additional performance improvements
- [x] T150: Schema flag for interned columns + ValueID-first predicate ops (docs/PERFORMANCE.md)
- [x] T151: Late materialization (decode only projected columns)
- [x] T152: Batch (vectorized) scan/filter/project pipeline
- [x] T153: MVCC Visible Version Index cache. Latest:
crates/skeindb-core/src/mvcc.rs now exposes a bounded VisibleVersionIndex keyed by row_id + snapshot_epoch_bucket, validates cached entries against the current RowDir head pointer plus the cached version's exact visibility window before reuse, and falls back to normal chain walking on head changes or same-bucket timestamp drift. Unit tests cover cache hits, same-bucket revalidation, head-change invalidation, and bounded eviction; integration tests cover file-backed reuse over RowSegmentSet + RowDir.
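The T153 validation steps (bucketed key, head-pointer check, exact-window recheck) can be sketched with a small map-backed cache. The `VisibleVersionCache` and `CachedVersion` types, the `u64` stand-ins for file pointers, the fixed bucket width, and the arbitrary-victim eviction are all assumptions made for illustration, not the real `VisibleVersionIndex`.

```rust
use std::collections::HashMap;

/// Hypothetical cached resolution for one (row, snapshot bucket).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CachedVersion {
    pub head: u64,     // RowDir head pointer at the time of caching
    pub version: u64,  // pointer to the version that was visible
    pub begin_ts: u64, // that version's visibility window
    pub end_ts: u64,   // 0 = not yet superseded
}

pub struct VisibleVersionCache {
    bucket_width: u64,
    capacity: usize,
    entries: HashMap<(u64, u64), CachedVersion>,
}

impl VisibleVersionCache {
    pub fn new(bucket_width: u64, capacity: usize) -> Self {
        Self {
            bucket_width: bucket_width.max(1),
            capacity: capacity.max(1),
            entries: HashMap::new(),
        }
    }

    fn key(&self, row_id: u64, snapshot_ts: u64) -> (u64, u64) {
        (row_id, snapshot_ts / self.bucket_width)
    }

    /// Returns the cached version only if the head is unchanged and the exact
    /// snapshot still falls inside the cached window; otherwise the caller
    /// falls back to a normal chain walk.
    pub fn get(&self, row_id: u64, current_head: u64, snapshot_ts: u64) -> Option<u64> {
        let e = self.entries.get(&self.key(row_id, snapshot_ts))?;
        let head_ok = e.head == current_head;
        let window_ok = e.begin_ts != 0
            && e.begin_ts <= snapshot_ts
            && (e.end_ts == 0 || e.end_ts > snapshot_ts);
        if head_ok && window_ok { Some(e.version) } else { None }
    }

    pub fn insert(&mut self, row_id: u64, snapshot_ts: u64, entry: CachedVersion) {
        let key = self.key(row_id, snapshot_ts);
        if self.entries.len() >= self.capacity && !self.entries.contains_key(&key) {
            // Crude bound for the sketch: evict an arbitrary entry.
            if let Some(victim) = self.entries.keys().next().copied() {
                self.entries.remove(&victim);
            }
        }
        self.entries.insert(key, entry);
    }
}
```

The window recheck is what handles same-bucket drift: two snapshots can share a bucket yet straddle a version boundary, so the bucket alone is never trusted.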
Phase 16 - Query coalescing (thundering herd protection)
- [x] T160: Query fingerprint canonicalization (SkeinIR + SkeinQL) + auth scope keying
- [x] T161: In-flight query map (leader/joiner) with cancellation semantics
- [x] T162: Enable coalescing for GET /api/v1/q/{query_id} (cacheable) + tests
- [x] T163: Metrics + limits + SkeinAdmin dashboard widget
Phase 17 - CAS-aware replication bandwidth bounds (object-aware sync)
- [x] T165: Bloom summaries for ValueID existence (per valseg + union)
- [x] T166: Object pull protocol (batch missing ValueIDs, fetch objects, verify hashes). Latest: added replica-side
objects.pull, which batches locally-missing ValueIDs, calls remote objects.fetch, validates a lossless transferred VE1 payload (entry_b64) against the requested ValueID before import, and recursively fetches missing delta-base dependencies so pulled entries remain materializable after ingest. Tests cover batch fetching with local-hit skipping, delta-base dependency pulling, and hash-mismatch rejection.
- [x] T167: Replication metrics: object hit-rate, saved bytes, ref-bytes vs obj-bytes. Latest: added
ReplicationObjectCounters to the server counters, instrumented objects.need/objects.missing/objects.fetch with hit/miss accounting and byte accounting (hits accumulate ref_bytes, fetches accumulate obj_bytes), exposed a new cluster.replication_stats RPC (read-only, capability-listed) reporting need_*, missing_*, fetch_*, ref_bytes, obj_bytes, hit_rate, saved_bytes_ratio, and last_updated_ms, and embedded the same JSON under stats.snapshot.cluster.replication_objects. One end-to-end integration test (cluster_replication_stats_tracks_hits_misses_and_bytes) verifies the counters advance correctly across seed → need → missing → fetch → stats.snapshot.
- [x] T168: Shard move/rebalance uses object manifests + progress reporting. Latest:
cluster.shard.move / cluster.shard.rebalance now build shard-scoped object manifests from live row versions, ask the destination node which ValueIDs are missing via objects.need, pull only the missing objects via objects.pull before updating placement, and return manifest/progress summaries including total/missing object counts and bytes plus batch/pull/store outcomes. Tests cover engine-side manifest deduplication and an end-to-end shard move that transfers missing objects to a destination node.
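The T168 flow (deduplicated manifest, ask the destination what is missing, pull only that) reduces to a set difference plus some bookkeeping. The sketch below models it in memory under simplifying assumptions: ValueIDs are plain `u64`s, the destination's answer to `objects.need` is a pre-built set, and `MovePlan` / `plan_shard_move` are hypothetical names rather than the engine's API.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical progress summary for one shard move.
#[derive(Debug, PartialEq)]
pub struct MovePlan {
    pub total_objects: usize,
    pub total_bytes: u64,
    pub missing_objects: usize,
    pub missing_bytes: u64,
    pub to_pull: Vec<u64>, // ValueIDs the destination still needs
}

/// `row_refs` lists (value_id, size_bytes) for every live row version in the
/// shard; the same object may be referenced many times. `dest_has` stands in
/// for the destination's answer about which ValueIDs it already holds.
pub fn plan_shard_move(row_refs: &[(u64, u64)], dest_has: &HashSet<u64>) -> MovePlan {
    // Deduplicate: each object appears in the manifest once.
    let mut manifest: BTreeMap<u64, u64> = BTreeMap::new();
    for &(value_id, size) in row_refs {
        manifest.entry(value_id).or_insert(size);
    }

    let mut plan = MovePlan {
        total_objects: manifest.len(),
        total_bytes: manifest.values().sum(),
        missing_objects: 0,
        missing_bytes: 0,
        to_pull: Vec::new(),
    };
    for (&value_id, &size) in &manifest {
        if !dest_has.contains(&value_id) {
            plan.missing_objects += 1;
            plan.missing_bytes += size;
            plan.to_pull.push(value_id);
        }
    }
    plan
}
```

Because the manifest is content-addressed, the bytes actually transferred are bounded by `missing_bytes` rather than by the shard's logical size.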
Phase 18 - Self-tuning index advisor
- [x] T170: Telemetry feature extraction (predicates/order/group/join keys) + privacy-safe storage
- [x] T171: Candidate index generator + duplication/prefix checks. Latest: advisor synthesis now suppresses exact duplicates, primary-key prefixes, prefixes already covered by existing MySQL-compatible indexes, and any suggestion IDs that were previously applied or dismissed.
- [x] T172: Benefit estimator (Level 0 rule-based) + SkeinQL advisor.* endpoints. Latest:
advisor.evaluate now emits measured before/after latency stats for benchmarkable equality, join-key filters, multi-range filters, narrow order-by, grouped phases including mixed range/order/group layouts, and non-grouped same-leading range+order by comparing live full scans against a hypothetical advisor-built secondary index; non-grouped range+order layouts without a same-leading key still fall back to observed-before / expected-after scan summaries.
- [x] T173: Apply suggestion (CREATE INDEX) + progress + rollback-on-failure. Latest:
advisor.apply_index now queues background secondary-index builds, advisor.history records queued/building/completed/failed lifecycle state with progress percentages, and failed builds record rollback metadata before the suggestion can surface again.
- [x] T174: SkeinAdmin "Index Advisor" page + before/after performance report
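The T171 suppression rules (exact duplicates, primary-key prefixes, prefixes of existing indexes, previously applied or dismissed IDs) amount to a filter over candidate column lists. The following sketch shows one way to express them; the `Candidate` struct and `filter_candidates` function are hypothetical, and columns are compared as plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical advisor suggestion: an id plus an ordered column list.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: String,
    pub columns: Vec<String>,
}

/// Drop candidates that would add nothing: empty lists, ids already applied or
/// dismissed, leading prefixes of the primary key or of an existing index,
/// and repeats of a column list already kept.
pub fn filter_candidates(
    candidates: Vec<Candidate>,
    primary_key: &[String],
    existing: &[Vec<String>],
    handled_ids: &HashSet<String>,
) -> Vec<Candidate> {
    let mut seen: HashSet<Vec<String>> = HashSet::new();
    candidates
        .into_iter()
        .filter(|c| {
            !c.columns.is_empty()
                && !handled_ids.contains(&c.id)
                // An index on a leading prefix of the PK is already served by it.
                && !primary_key.starts_with(&c.columns)
                // Same for a leading prefix of any existing index.
                && !existing.iter().any(|ix| ix.starts_with(&c.columns))
                // `insert` returns false for an exact duplicate column list.
                && seen.insert(c.columns.clone())
        })
        .collect()
}
```

Prefix checks use the leading-column rule: an index on `(email, created_at)` already covers lookups on `(email)`, so a standalone `(email)` suggestion is dropped.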
Phase 19 - Time travel and replay bundles
- [x] T180: MVCC as_of reads (planner + executor) + SkeinQL
as_of parameter (docs/TIME_TRAVEL_REPLAY.md)
- [x] T181: SQL compatibility surface for as_of reads (session variable + query hint). Latest:
MySqlSessionState now carries a skein_as_of_ms field; SET @@skein.as_of = '<iso>' | <epoch_ms> | NULL | DEFAULT parses ISO-8601 (with Z / ±HH:MM offsets, fractional seconds) and integer epoch-milliseconds values to control time-travel reads for subsequent SELECTs. The optimizer-style query hint /*+ SKEIN_AS_OF('<ts>') */ is extracted and stripped in sql_exec before parsing to override the session value per-statement. Both forms thread through to the MVCC-aware query_select as_of filter (T180). 5 tests (unit tests parse_as_of_timestamp_accepts_iso_and_epoch_forms, parse_skein_as_of_assignment_value_handles_null_default_and_iso, extract_skein_as_of_hint_strips_hint_and_returns_epoch_ms, plus 2 integration tests) cover parsing, session SET/clear, and SELECT filtering via both hint and session variable.
- [x] T182: History retention policy + garbage collection for old versions. Latest: new
maintenance.history.* RPC surface: maintenance.history.status reports per-table live/tombstone/purgeable counts plus oldest_tombstone_commit_ts_ms; maintenance.history.set_policy persists history.retention.enabled and history.retention.window_ms via the settings subsystem; maintenance.history.gc purges MVCC tombstones whose commit_ts_ms <= horizon (explicit params or derived from retention policy). Pre-T180 tombstones (commit_ts_ms == 0) are always retained for safety. GC rebuilds the pk_index, bumps table_version so secondary indexes refresh lazily, clears cached vector indexes, and persists each touched table. maintenance.history.status is included in the read-only RPC allowlist alongside maintenance.compaction.status. Three engine unit tests (history_gc_purges_old_tombstones_and_preserves_live_rows, history_gc_retains_pre_t180_tombstones, history_gc_horizon_filters_recent_tombstones) cover the basic purge path, the pre-T180 safety retention, and the commit_ts_ms > horizon filter.
- [x] T183: Replay bundle format + export/import tooling + deterministic replay runner. Latest:
maintenance.replay.export now emits a typed replay bundle containing table schema snapshots, retained row versions, filtered ChangeEvent metadata, per-table checksums, and an overall manifest checksum; maintenance.replay.import materializes that bundle into a hidden .replay_workspaces/<workspace_id> data root; and maintenance.replay.run reopens the imported workspace and verifies deterministic integrity by recomputing canonical table and bundle checksums. The CLI now exposes skeindb replay export, skeindb replay verify, and skeindb replay run, and the server advertises the new replay methods through the SkeinQL RPC surface. Coverage includes CLI parse tests, an engine roundtrip unit test (replay_bundle_export_import_run_roundtrip), and an HTTP/RPC integration test (t183_replay_bundle_export_import_run_roundtrip).
- [x] T184: SkeinAdmin pages for time travel and replay bundles + integrity status. Latest: SkeinAdmin now exposes a dedicated
Time Travel & Replaypanel with point-in-timequery.selectexecution viaas_of, live history retention/status controls backed bymaintenance.history.*, replay bundle export/download/import flows backed bymaintenance.replay.*, session-local replay workspace tracking, and a rendered checksum/integrity summary aftermaintenance.replay.run. Asset coverage includesskeinadmin_replay_panel_exposes_time_travel_and_integrity_controls.
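
The tombstone purge rule described for `maintenance.history.gc` can be sketched as follows. This is a minimal illustration, not the engine code: `RowVersion`, `gc_tombstones`, and `horizon_from_policy` are hypothetical names, and deriving the horizon as "now minus retention window" is an assumption about how the retention policy is applied.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    pk: int
    tombstone: bool
    commit_ts_ms: int  # 0 stands for a pre-T180 version with no commit timestamp


def horizon_from_policy(now_ms: int, window_ms: int) -> int:
    # Assumption: the retention policy keeps the last `window_ms` of history.
    return now_ms - window_ms


def gc_tombstones(versions, horizon_ms):
    """Split versions into (kept, purged) for one table.

    A version is purged only if it is a tombstone, has a real commit
    timestamp, and that timestamp is at or before the horizon.
    """
    kept, purged = [], []
    for v in versions:
        purgeable = (
            v.tombstone
            and v.commit_ts_ms != 0          # pre-T180 tombstones are retained
            and v.commit_ts_ms <= horizon_ms
        )
        (purged if purgeable else kept).append(v)
    return kept, purged
```

Live rows, tombstones newer than the horizon, and tombstones with `commit_ts_ms == 0` all land in `kept`, matching the three cases the unit tests above are described as covering.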
Phase 20 - Dedup-preserving encryption
Current truth: crypto primitives, envelope helpers, rotation helpers, and settings.encryption.* controls exist; data/encryption.json now persists per-database mode/active-key metadata plus the redacted audit ring, but the main engine read/write path does not yet route normal table values through EncryptedValueStore, and master key bytes are still held only in memory, so they do not survive a restart.
- [x] T190: Key management + AEAD wrappers (ENC_RANDOM, ENC_MLE_DB) (docs/CONVERGENT_ENCRYPTION.md). Latest: skeindb-core now exposes a standalone encryption baseline via encryption::DatabaseKeyManager, database-scoped encryption profiles, active key selection, and EncryptionEnvelope wrappers for ENC_RANDOM and ENC_MLE_DB. ENC_RANDOM derives a mode-specific AES-256-GCM-SIV key from the database master secret and uses randomized nonces; ENC_MLE_DB derives a deterministic content key from the database master secret plus a SHA-256 plaintext digest and returns an envelope carrying the derivation salt for later persistence work in T191. 2026-04-23 hardening: ENC_MLE_DB no longer uses a fixed zero AEAD nonce — the nonce is now HKDF-derived from the same (master_key, plaintext_digest) scope via a distinct info label, so dedup convergence is preserved while the fixed-nonce review finding is closed. Focused integration coverage lives in crates/skeindb-core/tests/encryption.rs.
- [x] T191: ValueStore encryption metadata + encrypt/decrypt paths. Latest: new EncryptedValueStore wrapper in skeindb-core (crates/skeindb-core/src/encrypted_valuestore.rs) layers put_encrypted / get_decrypted / read_envelope / reencrypt_value over an underlying ValueStore without changing the on-disk .vseg format — the EncryptionEnvelope is fully self-describing (mode code, scope id, optional key id, optional 12-byte nonce, optional 32-byte derivation salt, ciphertext) and is stored as an ordinary ValueKind::Cell blob. EncryptionEnvelope::from_stored_bytes is a strict parser that rejects trailing bytes and bad version codes. Off values bypass the envelope entirely and are stored as raw bytes. Coverage: envelope_stored_bytes_roundtrip_via_from_stored_bytes, encrypted_value_store_roundtrip_under_three_modes (ENC_OFF / ENC_RANDOM / ENC_MLE_DB).
- [x] T192: Key rotation + background re-encryption task + progress reporting. Latest: DatabaseKeyManager::rotate_active_key(db, new_key_id) returns a KeyRotationPlan { db, mode, previous_key_id, new_key_id }; DatabaseKeyManager::reencrypt_envelope(ctx, env, &mut ReencryptionProgress) rewrites a single envelope under the new active key (or returns Ok(None) when no rewrite is required) and updates inspected/rewritten/skipped-off/skipped-current counters plus the (previous_key_id, new_key_id) rotation context. EncryptedValueStore::reencrypt_value(ctx, value_id, &mut progress) writes the rewritten envelope back into the same ValueStore (old envelopes are left in place so historical reads continue to work under prior keys until a separate GC pass collects them). Coverage: rotate_active_key_then_reencrypt_envelope_progress_counters, encrypted_value_store_reencrypt_value_writes_new_envelope.
- [x] T193: settings.encryption.* SkeinQL endpoints + SkeinAdmin UI + audit notes. Latest: new RPC methods settings.encryption.status, settings.encryption.set_mode, settings.encryption.register_key, settings.encryption.set_active_key, settings.encryption.rotate_key are wired through Engine (Engine.key_manager, Engine.encryption_audit) and the JSON-RPC dispatcher (crates/skeindb/src/server.rs). Master key bytes are accepted as base64 (standard / URL-safe / URL-safe-no-pad), validated to decode to exactly 32 bytes, and never persisted; operators re-register keys after restart. Mutating endpoints append a redacted EncryptionAuditEntry to a 256-event in-memory ring exposed by settings.encryption.status (recent_audit). SkeinAdmin gains a dedicated Encryption panel (sidebar + top tab) with cards for Status, Set Mode, Register Key, Set Active Key, and Rotate Key. Read-only allowlist + mysql_known_method registry are extended with the new method names. Coverage: skeinadmin_encryption_panel_exposes_key_management_controls.
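
To illustrate the T190 hardening note (a deterministic nonce derived from the same `(master_key, plaintext_digest)` scope under a distinct info label), here is a hedged sketch using HKDF-SHA-256 from RFC 5869. The info labels, the choice to pass the plaintext digest as the HKDF salt, and the function names are assumptions made for this example; the actual derivation in `skeindb-core` may differ. The AEAD step itself is omitted because AES-256-GCM-SIV is not in the Python standard library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF extract-then-expand (RFC 5869) with HMAC-SHA-256."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_mle_key_and_nonce(master_key: bytes, plaintext: bytes):
    """Derive a content key and nonce that depend only on (master_key, plaintext).

    Hypothetical labels; equal plaintexts under one master key converge on the
    same (key, nonce), which is what keeps deduplication possible.
    """
    digest = hashlib.sha256(plaintext).digest()
    key = hkdf_sha256(master_key, digest, b"example/enc-mle-db/key", 32)
    nonce = hkdf_sha256(master_key, digest, b"example/enc-mle-db/nonce", 12)
    return key, nonce
```

Because the key and the nonce use different info labels, the nonce is no longer a fixed constant, yet two writes of the same plaintext still produce identical inputs to the AEAD.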
Phase 21 - Workload-guided compaction scheduler
- [x] T200: Telemetry signals for compaction (L0 pressure, stalls, latencies) (docs/COMPACTION_SCHEDULER.md). Latest: `stats.snapshot` now scans live `.rseg` segment files for L0 pressure, records bounded soft/hard pressure events, and exposes recent point/range/write rates plus read/write latency percentiles for SkeinAdmin and future scheduler inputs.
- [x] T201: Budget-based compaction scheduler + peak windows + bounds enforcement. Latest: persisted `compaction.*` settings now drive a live heuristic scheduler state in `stats.snapshot.compaction.scheduler`, including configured/effective IO+CPU budgets, peak-window scaling, task priority scoring, a pressure-driven background worker that can batch multiple removable file-pressure `.rseg` tasks in one tick while still rewriting canonical live-table segments as needed, and hard-pressure safe-mode write throttling for write-classified SkeinQL/HTTP requests.
- [x] T202: maintenance.compaction.* endpoints (status/set_policy/pause/resume). Latest: `maintenance.compaction.status`, `maintenance.compaction.set_policy`, `maintenance.compaction.pause`, and `maintenance.compaction.resume` now expose and persist runtime scheduler policy through the main RPC surface, `maintenance.compaction.status` reports real worker `runs`, current/last task metadata, bytes rewritten/reclaimed, orphan cleanup counts, and last-error state, and SkeinAdmin now renders those worker counters directly.
- [x] T203: Evaluation harness scripts + dashboards for stall rate and p99 latency. Latest: `eval/compaction_scheduler_dashboard.py` now emits a deterministic summary JSON, timeline CSV, and self-contained HTML dashboard comparing fixed leveling, fixed tiering, and workload-guided policies on stall rate and p99 latency.
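
The relationship between a configured budget and an effective budget under peak-window scaling and bounds enforcement (T201) might look roughly like the sketch below. Every name and parameter here (`effective_budget`, `peak_scale`, `floor`, `ceiling`, hour-based windows) is hypothetical; the real `compaction.*` settings are documented in docs/COMPACTION_SCHEDULER.md and may be shaped differently.

```python
def in_window(hour: int, start: int, end: int) -> bool:
    """True if `hour` falls in [start, end), allowing windows that wrap midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


def effective_budget(configured: float, hour: int, peak_start: int, peak_end: int,
                     peak_scale: float, floor: float, ceiling: float) -> float:
    """Scale the configured budget during the peak window, then clamp to bounds."""
    scaled = configured * peak_scale if in_window(hour, peak_start, peak_end) else configured
    return max(floor, min(ceiling, scaled))
```

Clamping after scaling means a very aggressive peak scale can never starve compaction below the floor, which is one plausible reading of "bounds enforcement".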
Phase 22 - SQL autoparameterization and plan cache
- [x] T210: SQL normalization (fingerprints) + parameter extraction (docs/AUTOPARAMETERIZATION.md)
- [x] T211: Plan cache keyed by fingerprint + schema version + session flags
- [x] T212: Integrate autoparam with query coalescing, ETag caching, and telemetry
- [x] T213: SQL session variable: `SET @@skein.autoparameterize = 1` + safety rules
- [x] T214: SkeinAdmin top queries grouped by fingerprint + suggested parameter schemas
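
As a rough illustration of T210 and T211, the sketch below replaces string and numeric literals with `?`, collects the extracted values, and hashes the normalized text into a fingerprint that can be combined with a schema version and session flags to form a plan-cache key. The regex, the 16-hex-digit fingerprint, and the key layout are assumptions for this example; the rules actually used are described in docs/AUTOPARAMETERIZATION.md.

```python
import hashlib
import re

# Single-quoted strings (with '' escapes) or standalone integer/decimal literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def autoparameterize(sql: str):
    """Return (normalized_sql, params, fingerprint) for one statement."""
    params = []

    def repl(match):
        tok = match.group(0)
        if tok.startswith("'"):
            params.append(tok[1:-1].replace("''", "'"))
        else:
            params.append(float(tok) if "." in tok else int(tok))
        return "?"

    normalized = " ".join(_LITERAL.sub(repl, sql).split())  # also collapses whitespace
    fingerprint = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return normalized, params, fingerprint


def plan_cache_key(fingerprint: str, schema_version: int, session_flags: frozenset):
    # Hypothetical key shape: any of the three changing forces a new plan.
    return (fingerprint, schema_version, session_flags)
```

Two statements that differ only in literal values share a fingerprint, so they can share a cached plan as long as the schema version and session flags match.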
Phase 23 - CDC and dependency-driven changefeeds
- [x] T220: WAL-to-change-event translator (table-level insert/update/delete) (docs/CDC_CHANGEFEED.md). Latest: the persisted CDC change log now records `commit_ts_ms` plus `lsn`-style sequence metadata for table-level insert/update/delete events and acts as the retained WAL-equivalent source for `cdc.poll` / SSE replay.
- [x] T221: cdc.subscribe_table + cdc.poll/cdc.ack/cdc.close endpoints. Latest: table subscriptions now accept optional exact `pk` tuple filters, inclusive single-column `pk_range` bounds, `ops` allowlists over `insert`/`update`/`delete` source events, changed-column `columns` filters that only emit events where at least one selected column value changes and project included row images down to those columns, and `format: "objects_json" | "plain_json"` delivery; durable `cdc_subscriptions.json` state is now format v8 with backward-compatible v1/v2/v3/v4/v5/v6/v7 loading.
- [x] T222: Dependency-driven query changefeeds (cdc.subscribe_query) using ETag dependency sets. Latest: prepared queries can now create CDC subscriptions through `cdc.subscribe_query`, and `cdc.poll` emits dependency-driven `invalidate` events carrying the current query ETag whenever an allowed source op touches one of the prepared query's dependency tables, including base tables reached through view dependencies, tables referenced by set-operation branches like `UNION`/`UNION ALL`, and base tables referenced inside CTE definitions while suppressing the CTE names themselves as fake physical tables; optional exact `pk` filters, inclusive `pk_range` filters, and changed-column `columns` filters are applied to the triggering table event before the invalidate is emitted, included triggering row images are projected to those columns, and polling recomputes that base-table change set from the stored query so legacy durable subscriptions continue invalidating correctly after restart.
- [x] T223: SSE/WebSocket streaming endpoint + backpressure + reconnect semantics. Latest: `GET /api/v1/cdc/sse/{sub_id}` and `GET /api/v1/cdc/ws/{sub_id}` stream both table CDC events and query invalidation events in the subscription's persisted event format, replay from the retained change log in bounded batches, resume from `Last-Event-ID` or `from_offset`, and emit `backpressure`/`resnapshot` control messages when lag, pause, or retention state requires operator attention.
- [x] T224: Retention + resnapshot protocol when WAL horizon is exceeded. Latest: bounded retained CDC history now reports `earliest_offset`/`latest_offset`, `cdc.poll` returns explicit `resnapshot_required` responses when a consumer falls behind the retained horizon, and SSE emits a `resnapshot` control event with the same recovery metadata.
- [x] T225: SkeinAdmin CDC page + subscription management + lag visualization. Latest: SkeinAdmin now exposes a dedicated CDC panel for table/query subscribe, poll, pause, resume, ack, and close flows with session-local lag bars, runtime pressure summaries, backpressure state badges, and recent-event inspection.
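
The retention and resnapshot behaviour from T224 can be modelled with a bounded in-memory log. This is only a sketch: the `ChangeLog` class, its `capacity` parameter, and the exact response fields other than `earliest_offset`, `latest_offset`, and `resnapshot_required` are assumptions, and the real change log is persisted rather than held in a deque.

```python
from collections import deque


class ChangeLog:
    """Bounded change log that drops the oldest events past `capacity`."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.events = deque()
        self.next_offset = 0

    def append(self, table: str, op: str, pk) -> None:
        self.events.append({"offset": self.next_offset, "table": table, "op": op, "pk": pk})
        self.next_offset += 1
        while len(self.events) > self.capacity:
            self.events.popleft()  # retention: oldest events fall off the horizon

    def poll(self, from_offset: int, limit: int = 100) -> dict:
        if not self.events:
            return {"events": [], "next_offset": from_offset}
        earliest = self.events[0]["offset"]
        latest = self.events[-1]["offset"]
        if from_offset < earliest:
            # The consumer asked for events that are no longer retained.
            return {"resnapshot_required": True,
                    "earliest_offset": earliest, "latest_offset": latest}
        batch = [e for e in self.events if e["offset"] >= from_offset][:limit]
        next_offset = batch[-1]["offset"] + 1 if batch else from_offset
        return {"events": batch, "next_offset": next_offset,
                "earliest_offset": earliest, "latest_offset": latest}
```

A consumer that receives `resnapshot_required` would reload the table state and then resume polling from an offset at or after `earliest_offset`.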
Phase 24 - Website and documentation site polish
- [x] T230: Homepage: add Docs nav CTA, mobile hamburger menu, maturity badges on feature cards, fix broken links (architecture image, paper), consistent API endpoints
- [x] T231: Docs site homepage: sync with public site (mobile menu, Docs CTA, maturity badges, fixed quickstart endpoint, correct paper link)
- [x] T232: Docs landing (docs.html): add client-side search/filter, mobile menu, polished footer, keyword metadata on cards
- [x] T233: Footer overhaul across all pages — structured 4-column footer with Product/Documentation/Community sections
- [x] T234: Research tracks on public site converted to clickable links pointing to docs/site/research pages
Phase 25 - PostgreSQL wire protocol compatibility
- [x] T400: PG v3 wire protocol primitives (`pg_wire.rs`): message framing, encode/decode for StartupMessage, RowDescription, DataRow, CommandComplete, ErrorResponse, ParameterStatus, BackendKeyData, Terminate. Includes PG connection handler with simple query protocol, trust/SCRAM auth, SSL rejection, and delegation to the shared SQL execution engine. 20 unit tests + 6 integration tests.
- [x] T401: SCRAM-SHA-256 authentication: RFC 5802/7677 SASL exchange with trust mode fallback. Implements full SCRAM-SHA-256 state machine in `pg_wire::scram` module: HMAC-SHA-256, PBKDF2-HMAC-SHA256 (4096 iterations), ScramCredentials (stored_key + server_key derivation), ScramServer (client-first → server-first → client-final → server-final with proof verification). Wire helpers: `write_auth_sasl`, `write_auth_sasl_continue`, `write_auth_sasl_final`, `parse_sasl_initial_response`, `parse_sasl_response`. PG connection handler upgraded from cleartext to SCRAM-SHA-256 when `SKEINDB_TOKEN` is set; trust mode when unset. Deterministic salt derivation via `pg_scram_salt_for_token`. 12 new unit tests (HMAC known vector, PBKDF2, credential derivation, full exchange success/failure, GS2 header rejection, nonce missing, SASL message parsing).
- [x] T402: PG session state: `pg_settings` HashMap on `MySqlSessionState` initialized with 13 PG defaults; `SET key = value` / `SET key TO value` parsing (with LOCAL/SESSION prefix support); `RESET key` / `RESET ALL`; `pg_bootstrap_setting_value` reads session overrides first; `ParameterStatus` sent to client on SET/RESET; `SHOW`, `current_setting(name)`, and `current_setting(name, missing_ok)` now reflect session values, with the two-argument form returning NULL for unknown bootstrap probes when `missing_ok = true`; `current_schema` with optional parentheses and `current_schemas(bool)` now derive from the effective `search_path`, `current_catalog` aliases `current_database()`, and bootstrap `current_user`/`current_role`/`session_user`/`user` probes preserve the startup username. 10 unit tests.
- [x] T403: PG connection handler + listener (in `server.rs`): SSL negotiation (reject with 'N'), startup message parsing, trust/SCRAM auth, ParameterStatus batch, BackendKeyData, ReadyForQuery, simple query command loop on port 5432 (configurable via `--pg` flag, default 5432, 0 disables)
- [x] T404: PG SQL dialect parser (`pg_rewrite_sql`): double-quoted identifiers → backtick-quoted, $$dollar quoting$$ → single-quoted, :: type casts → CAST(… AS …), IS [NOT] DISTINCT FROM → null_safe_eq, FETCH FIRST n ROWS ONLY → LIMIT n, ARRAY[…] → PG array literal string. ILIKE and boolean literals were already implemented. RETURNING deferred to T405 (DML).
- [x] T405: PG DML extensions: `ON CONFLICT DO NOTHING` → `INSERT IGNORE INTO`, `ON CONFLICT (...) DO UPDATE SET ... EXCLUDED.col` → `ON DUPLICATE KEY UPDATE ... VALUES(col)` via `pg_rewrite_on_conflict` post-pass; `INSERT/UPDATE/DELETE ... RETURNING col1, col2, *` extracted and stripped at `pg_dispatch_sql` level with follow-up SELECT using PK lookup for INSERT RETURNING; `COPY table [ (col, ...) ] TO STDOUT`, `COPY table [ (col, ...) ] FROM STDIN`, and `COPY (SELECT ...) TO STDOUT` now work over simple and extended query flows for the default text protocol plus explicit `WITH (FORMAT text)`/`WITH (FORMAT csv)`/`WITH (FORMAT binary)`, PostgreSQL keyword-style `WITH (TEXT)`/`WITH (CSV)`/`WITH (BINARY)` aliases, legacy bare `WITH TEXT`/`WITH CSV`/`WITH BINARY` forms, text/csv `NULL '...'`, CSV `HEADER`, CSV `HEADER MATCH` on import, and single-byte `DELIMITER`, `QUOTE`, and `ESCAPE` on supported CSV forms; unsupported COPY formats/options still return `0A000`. Expanded unit and integration coverage.
- [x] T406: PG DDL: SERIAL/BIGSERIAL/SMALLSERIAL → auto_increment + i64 type, CREATE SCHEMA → CREATE DATABASE alias (with IF NOT EXISTS), CREATE INDEX CONCURRENTLY (accepted/ignored), CREATE INDEX IF NOT EXISTS, COMMENT ON (silently accepted). 9 unit tests.
- [x] T407: PG type OID mapping + encoding: bool→16, i64→20, text→25, jsonb→3802, timestamp→1114, arrays; text + binary format. Added 11 array OID constants (BOOL_ARRAY, INT4_ARRAY, INT8_ARRAY, FLOAT4_ARRAY, FLOAT8_ARRAY, TEXT_ARRAY, VARCHAR_ARRAY, DATE_ARRAY, TIMESTAMP_ARRAY, JSONB_ARRAY, UUID_ARRAY) with `array_element_oid`/`scalar_to_array_oid` utilities. Enhanced type inference heuristic from 3 to 10 types (bool, i64, f64, date, time, datetime, uuid, json, bytes, string). Added `encode_binary_value()` for PG binary wire format (BOOL, INT4, INT8, FLOAT4, FLOAT8, TEXT, VARCHAR, JSON, JSONB, UUID, BYTEA). Bind handler now accepts binary result format codes and stores them in PgPortal; Execute path applies format-aware encoding via `pg_format_code_at`: binary columns use `encode_binary_value`, text columns use `pg_text_value_for_column`. 29 new unit tests.
- [x] T408: PG result encoding: RowDescription, DataRow, CommandComplete ("INSERT 0 1"), ErrorResponse with SQLSTATE codes. Latest: simple and extended PG queries now emit typed text-format `RowDescription` metadata for common numeric/text results, `DataRow` payloads, PG-style `CommandComplete` tags for DML/DDL, and `ErrorResponse` SQLSTATEs end-to-end over the live listener.
- [x] T409: PG system catalogs (`pg_catalog.rs`): pg_database, pg_namespace, pg_roles, pg_authid, pg_user, pg_group, pg_tablespace, pg_am, pg_description, pg_tables, pg_views, pg_indexes, pg_matviews, pg_sequences, pg_stats, pg_class, pg_attribute, pg_type, pg_index, pg_constraint, pg_proc (basic builtin metadata), pg_settings, pg_stat_activity, pg_stat_database. Latest: catalog tables are served through the shared virtual-table executor, including `pg_am` (`heap`/`btree` access methods aligned with current `pg_class.relam` OIDs), `pg_description` (currently empty but exposed with PostgreSQL-correct OID/int4/text column metadata), `pg_class` (tables + index entries with relkind `r`/`i`), `pg_attribute` (column metadata with SkeinDB→PG type OID mapping), `pg_index` (primary key + secondary indexes with `indkey` position vectors), `pg_indexes` (PostgreSQL-style index definitions), `pg_stat_database` (single-row database counters), `pg_constraint` (primary key `p` + unique `u` constraints with `conkey` arrays), and `pg_proc` metadata for the bootstrap builtin set plus selected timestamp/UUID/array/string/aggregate helpers with aggregate `prokind` rows, both current `substring` arities, both current `current_setting` arities, and `current_schemas(bool)`. Column-level OID overrides ensure correct wire types for catalog-specific columns (OID, bool, int4, float4/float8).
- [x] T410: PG startup query handling: `SELECT version()`, `current_database()`, `current_catalog`, `current_schema` (with optional parentheses), `current_schemas(bool)`, `current_user`, `current_role`, `session_user`, `user`, `SHOW server_version`, `SHOW server_version_num`, `SHOW standard_conforming_strings`, `SHOW max_identifier_length`, `SHOW transaction isolation level`, and `SELECT current_setting(...)` including the two-argument `missing_ok` form for the common startup/bootstrap probes used by psql/Django/Rails/SQLAlchemy-style clients
- [x] T411: PG extended query protocol: Parse/Bind/Describe/Execute/Sync/Close/Flush, named statements + portals, $1/$2 parameter placeholders. Latest: the PG listener now keeps connection-local prepared statements and portals, supports text-format `Parse`/`Bind`/ statement+portal `Describe`/`Execute`/`Close`/`Flush`, substitutes `$1`/`$2` placeholders through the shared SQL engine, and uses `Sync` to recover cleanly after extended-protocol errors.
- [x] T412: PG function mapping (`pg_functions.rs`): string_agg, array_agg, gen_random_uuid, to_char/to_timestamp, date_trunc, extract(epoch FROM ...), jsonb_build_object, ->>/#>> operators, || concat, ~/~* regex, ARRAY operations, unnest. Latest: shared PG function coverage now includes `split_part(text, delimiter, n)` with positive and negative field indexes, `text` RowDescription metadata on the PG wire path, and matching `pg_catalog.pg_proc` builtin metadata.
- [x] T413: PG transaction semantics: ReadyForQuery status byte (I/T/E), failed-tx-block semantics, SAVEPOINT/RELEASE/ROLLBACK TO. Latest: the PG listener now preserves session state across simple queries, emits `ReadyForQuery` as `I`/`T`/`E`, rejects commands in aborted transaction blocks with `25P02`, treats `COMMIT` after an aborted transaction as a rollback, and wires `SAVEPOINT`/`RELEASE SAVEPOINT`/`ROLLBACK TO SAVEPOINT` into the existing undo-log bookkeeping.
- [x] T414: PG SQLSTATE error codes: 42P01 (undefined table), 42703 (undefined column), 23505 (unique violation), 42601 (syntax error), etc. Latest: PG simple-query errors now translate shared-engine/MySQL-style failures into PostgreSQL SQLSTATEs, including undefined tables, undefined columns, unique violations, syntax-path parser errors, unsupported features, savepoint lookup failures, and failed-transaction-block errors.
- [x] T415: PG compatibility test corpus (`tests/compat/pg_corpus.sql`): mirror MySQL corpus structure for PG dialect. Latest: `tests/compat/pg_corpus.sql` now exercises the current PG baseline over the live listener, covering startup probes, shared-engine SQL, and transaction/savepoint behavior through a dedicated `pg_compat_corpus_roundtrip` integration test.
- [x] T416: PG unit tests: SQLSTATE error code mapping (7 tests), type OID mapping for all MySqlStmtColumnType variants, pg_text_value bool normalization + null handling, sql_type_to_desc PG type coverage (serial/boolean/real/decimal/json/blob/timestamp), sql_detect_verb for CREATE/DROP SCHEMA and COMMENT ON, pg_rewrite_sql edge cases (casts in function args, nested parens, mixed features). 19 new tests added in this pass, bringing total PG unit test count to ~40.
- [x] T417: PG integration tests — SCRAM-SHA-256 auth (success + wrong-password rejection), binary result format via extended query, type OID inference for BOOL/INT8/FLOAT8/TEXT literals. 4 new integration tests added covering the T401 SCRAM handshake end-to-end, binary DataRow encoding, and RowDescription type OID correctness.
- [x] T418: PG compatibility documentation (`docs/PG_COMPAT.md`): refreshed to the current partial baseline (startup/auth, SSL rejection, simple query, tx/savepoint semantics, text-format extended query protocol, and PG corpus coverage) and linked backlog gaps
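
For readers unfamiliar with the SCRAM-SHA-256 pieces named in T401 (stored_key and server_key derivation, proof verification), the sketch below follows the key derivation and proof check from RFC 5802 using only the Python standard library. It is not the `pg_wire::scram` code: function names are hypothetical, SASLprep normalization of the password is skipped, and `auth_message` is passed in as an opaque byte string rather than being assembled from the client-first, server-first, and client-final messages.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def derive_credentials(password: str, salt: bytes, iterations: int = 4096):
    """Server-side stored credentials: (stored_key, server_key)."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = _hmac(salted, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    server_key = _hmac(salted, b"Server Key")
    return stored_key, server_key


def client_proof(password: str, salt: bytes, auth_message: bytes, iterations: int = 4096) -> bytes:
    """What a client sends in client-final: ClientKey XOR ClientSignature."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = _hmac(salted, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    return _xor(client_key, _hmac(stored_key, auth_message))


def verify_client_proof(stored_key: bytes, auth_message: bytes, proof: bytes) -> bool:
    """Server check: recover ClientKey from the proof and compare its hash."""
    recovered = _xor(proof, _hmac(stored_key, auth_message))
    return hmac.compare_digest(hashlib.sha256(recovered).digest(), stored_key)
```

The server never needs the password or the client key at rest; it can verify a proof from `stored_key` alone, which is why only `stored_key` and `server_key` are derived into the stored credentials.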
Phase 26 - Distribution and installation
- [x] T419: Debian packaging metadata + signed apt repository publication pipeline. Latest: `cargo-deb` metadata is wired into `crates/skeindb/Cargo.toml`, and tagged releases can publish a signed `apt` branch with `Packages`, `Release`, `InRelease`, and exported key material.
- [x] T420: Homebrew tap formula + release automation. Latest: the repo now ships `Formula/skeindb.rb` for tap-based installs, supports immediate `HEAD` installs from this repo, and tagged releases auto-render a stable formula from the release source tarball.
Research Agenda Extensions (Optional)
The repository includes a January 2026 research agenda with 20 proposals.
- Overview: `docs/RESEARCH_AGENDA.md`
- Task-level research tasks (T230+): `docs/RESEARCH_BACKLOG.md`
These items are intentionally separated from the core phases above to keep the main build plan focused.