Research Overview Tracks Agenda

Getting Started

This repository contains a single-binary SkeinDB prototype.

The focus of the scaffold is: - a portable executable (skeindb) that runs on Linux/macOS/Windows - an embedded HTTP API (SkeinQL JSON-RPC) + embedded admin UI (SkeinAdmin) - a MySQL compatibility surface (protocol + subset of SQL) intended as an adoption layer - research primitives (ETags, query patches, hash-chained WAL, ...)

Note The current executor is a small in-memory/JSON-backed prototype intended to make SkeinQL immediately usable. The full ValueID/MVCC/LSM storage engine described in the paper is a planned build-out.


1) Install

1.1 Homebrew / macOS

Use the repo as a tap:

brew tap pinkysworld/skeindb https://github.com/pinkysworld/SkeinDB
brew install --HEAD pinkysworld/skeindb/skeindb

Tagged releases update Formula/skeindb.rb, and the release workflow also attaches a codesigned macOS binary tarball for direct downloads. The stable Homebrew install path is:

brew install pinkysworld/skeindb/skeindb

1.2 apt-get

Once the signed apt branch has been published from a tagged release:

sudo curl -fsSL https://raw.githubusercontent.com/pinkysworld/SkeinDB/apt/pubkey.gpg \
  -o /usr/share/keyrings/skeindb-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/skeindb-archive-keyring.gpg] https://raw.githubusercontent.com/pinkysworld/SkeinDB/apt stable main" \
  | sudo tee /etc/apt/sources.list.d/skeindb.list >/dev/null
sudo apt-get update
sudo apt-get install skeindb

The repository, macOS codesign metadata, and apt signing metadata are generated by .github/workflows/release-packages.yml.

2) Build

Prerequisites: - Rust toolchain (stable)

From the repo root:

cargo build --release

The binary will be available at:

./target/release/skeindb

3) Run the single binary

Run with an HTTP port and MySQL port of your choice:

./target/release/skeindb serve \
  --data ./data \
  --http 8080 \
  --mysql 3306

You can now access: - SkeinAdmin: http://127.0.0.1:8080/admin - SkeinQL JSON-RPC: http://127.0.0.1:8080/api/v1/rpc - Prepared-query GET endpoint: http://127.0.0.1:8080/api/v1/q/<query_id> - MySQL listener on 127.0.0.1:3306 (handshake/auth + COM_QUERY SQL subset translation; broader MySQL compatibility is still expanding)


4) First SkeinQL commands

SkeinQL is a JSON-RPC-style protocol with its own envelope: every request sends "skeinql":"1.0", a client-chosen id, a method, and params. Responses come back as {"id":...,"ok":true,"result":{...}} or {"id":...,"ok":false,"error":{...}}.

4.1 Health check

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":1,"method":"system.version","params":{}}'

4.2 Create a database

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{
    "skeinql":"1.0",
    "id":2,
    "method":"schema.create_database",
    "params":{"db":"demo"}
  }'

4.3 Create a table

Column types are described with a TypeDesc object ({"kind":"i64"}, {"kind":"string"}, ...).

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{
    "skeinql":"1.0",
    "id":3,
    "method":"schema.create_table",
    "params":{
      "db":"demo",
      "table":"users",
      "primary_key":["id"],
      "columns":[
        {"name":"id","type":{"kind":"i64"},"nullable":false},
        {"name":"name","type":{"kind":"string"},"nullable":false},
        {"name":"updated_at","type":{"kind":"i64"},"nullable":true}
      ]
    }
  }'

4.4 Insert data

Row values are typed SkeinQL literals (Lit envelopes): {"t":"i64","v":1}, {"t":"str","v":"Ada"}, and so on.

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{
    "skeinql":"1.0",
    "id":4,
    "method":"data.insert",
    "params":{
      "into":{"db":"demo","table":"users"},
      "rows":[
        {"id":{"t":"i64","v":1},"name":{"t":"str","v":"Ada"},"updated_at":{"t":"i64","v":1}},
        {"id":{"t":"i64","v":2},"name":{"t":"str","v":"Linus"},"updated_at":{"t":"i64","v":1}}
      ]
    }
  }'

4.5 Query select

A query.select request carries a structured Query: the projection and from list live under body.select, and order_by items reference an expr.

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{
    "skeinql":"1.0",
    "id":5,
    "method":"query.select",
    "params":{
      "query":{
        "body":{
          "select":{
            "projection":[
              {"expr":{"col":"id"}},
              {"expr":{"col":"name"}}
            ],
            "from":[{"db":"demo","table":"users"}]
          }
        },
        "order_by":[{"expr":{"col":"id"},"dir":"asc"}]
      },
      "result_format":"objects_json"
    }
  }'

The response includes the rows under result.data, plus a cache etag and causality token you can feed back into later reads (see §6).

4.6 SQL compatibility endpoint + information_schema

The SQL compatibility helper endpoint is available at POST /api/v1/sql/exec. It now supports virtual metadata queries over: - information_schema.tables - information_schema.columns

Example:

curl -s http://127.0.0.1:8080/api/v1/sql/exec \
  -H 'content-type: application/json' \
  -d '{"sql":"SELECT table_schema, table_name FROM information_schema.tables ORDER BY table_schema, table_name LIMIT 10"}'

4.7 Transaction handles (SkeinQL)

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":8,"method":"tx.begin","params":{"read_only":true}}'

Commit:

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":9,"method":"tx.commit","params":{"tx_id":"tx_0000000000000001"}}'

5) Storage + dedup stats

stats.snapshot now reports live dedup effectiveness from persisted table data: - storage.dedup_ratio: logical bytes / unique bytes (higher means better dedup) - storage.logical_bytes: total bytes before dedup - storage.unique_bytes: unique bytes after content-addressed dedup - storage.duplicate_bytes: bytes saved by dedup - storage.interned_values: unique values currently interned - Table row encoding uses adaptive ValueID refs: repeated values are emitted as "$skein_ref" only when that lowers bytes on disk for the current table snapshot.

Optional persistence mode (default is segment): - --storage-mode json: write/read tables/<db>/<table>.json - --storage-mode segment: write/read tables/<db>/<table>.rseg - --storage-mode hybrid: write both formats; read prefers .rseg

Compatibility: - SKEINDB_STORAGE_MODE=dual remains accepted as an alias for hybrid.

Example:

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":5,"method":"stats.snapshot","params":{}}'

Top query fingerprints (by total time/count/latency):

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":6,"method":"stats.top_queries","params":{"limit":10,"sort_by":"total_ms"}}'

Recent slow queries:

curl -s http://127.0.0.1:8080/api/v1/rpc \
  -H 'content-type: application/json' \
  -d '{"skeinql":"1.0","id":7,"method":"stats.slow_queries","params":{"min_ms":200,"limit":20}}'

6) QueryPatch (delta updates)

See docs/QUERY_PATCH.md for a complete spec.

At a high level: 1) run query.select and keep the returned etag 2) periodically call query.patch(base_etag=...) 3) apply the returned delta to your cached list


7) MySQL compatibility

The MySQL surface is intended as an adoption layer so existing software can connect using standard MySQL drivers.

Current status is documented in: - docs/MYSQL_COMPAT.md


8) Where to go next

  • Operator knobs: docs/CONFIGURATION.md
  • Specs and research notes: docs/README.md