MySQL Compatibility¶
SkeinDB is designed to be adoptable by existing applications that already speak MySQL.
Honest scope. SkeinDB does not claim "100% MySQL compatibility" and is not a drop-in replacement for MySQL Server in every workload. "MySQL" is a trademark of Oracle and SkeinDB is not affiliated with or endorsed by Oracle. What we do claim is documented coverage of a regression corpus (
tests/compat/corpus.sql, currently 1658 lines / ~700+ statements covering WordPress-class workloads,information_schemaintrospection,SET NAMES,SHOW VARIABLES, prepared-statement protocol, and the SQL surface enumerated below) that runs on every commit viacluster_rpc::compat_corpus_statements.
The compatibility layer is intentionally treated as an adoption surface: - it makes it possible to point existing apps/tools at SkeinDB with minimal change - while SkeinDB retains a separate, research-friendly control plane (SkeinQL)
This document explains the intended scope and the current scaffold status.
What “MySQL compatibility” means here¶
There are two related layers:
1) MySQL protocol compatibility¶
Support the MySQL client/server wire protocol so standard clients and drivers can connect.
Examples:
- command-line clients (mysql)
- application drivers (JDBC, ODBC, Node mysql2, PHP mysqli, ...)
- tooling (migration tools, admin GUIs)
2) SQL dialect compatibility¶
Support a subset of MySQL SQL syntax and semantics.
Even with protocol support, SQL dialect mismatches can break apps.
Current status in this repository¶
- The CLI
--mysqllistener now supports a minimal MySQL wire handshake withmysql_native_passwordauth exchange. - The listener supports a
COM_QUERYtranslation subset throughsql.execfor: SELECT(literal-only, single-table, and simpleINNER JOIN/LEFT JOIN/RIGHT JOIN/CROSS JOIN/NATURAL JOIN/FULL OUTER JOINqueries, including basic left-associative multi-join chains, top-level comma-separatedFROMlists, baselineJOIN ... USING (...)rewrites for simple base-table joins with explicit/qualified projections, derived tables / FROM subqueries (SELECT * FROM (SELECT ...) AS alias), common table expressions (WITH name AS (SELECT ...) SELECT * FROM name), projection aliases with or withoutAS, and wildcard projections such asSELECT */table.*/db.table.*over those supported join shapes)SHOW(DATABASES,TABLES,COLUMNS)USECREATE DATABASE,CREATE TABLE,CREATE [UNIQUE] INDEX,DROP INDEX,DROP TABLEALTER TABLE ... ADD COLUMN(including compatibility handling forAFTER/FIRSTposition clauses),ALTER TABLE ... MODIFY [COLUMN],ALTER TABLE ... CHANGE [COLUMN],ALTER TABLE ... RENAME COLUMN,ALTER TABLE ... RENAME [KEY|INDEX] ... TO ...,ALTER TABLE ... RENAME [TO|AS] [db.]new_table,ALTER TABLE ... DROP COLUMN,ALTER TABLE ... ADD [UNIQUE] KEY, andALTER TABLE ... DROP [KEY|INDEX]INSERT,INSERT IGNORE,REPLACE,UPDATE,DELETEINSERT ... SELECTINSERT ... ON DUPLICATE KEY UPDATE(declared key-aware compatibility routing)- Multi-table
DELETE:DELETE t1 FROM t1 JOIN t2 ON ... WHERE ... - Multi-table
UPDATE:UPDATE t1 JOIN t2 ON ... SET t1.col = ... WHERE ...(executes via SELECT + affected-row counting) TRUNCATE TABLE(rewrite toDELETE FROM)DROP DATABASE/DROP SCHEMA(withIF EXISTS)RENAME TABLE ... TO ...(rewrite toALTER TABLE RENAME)CREATE VIEW/DROP VIEW(no-op stubs)UNION/UNION ALLSQL_CALC_FOUND_ROWSandFOUND_ROWS()- The MySQL wire layer also now includes a basic prepared-statement baseline:
COM_STMT_PREPARE,COM_STMT_EXECUTE, andCOM_STMT_CLOSECOM_STMT_SEND_LONG_DATA,COM_STMT_RESET, and baselineCOM_STMT_FETCH?placeholders are rebound into the same SQL-translation path asCOM_QUERY- simple prepared
SELECTs now advertise prepare-time result column counts and MySQL-style column definitions (including single-tableSELECT *, simple join projections across the supportedINNER/LEFT/RIGHT/CROSSjoin subset and top-level comma-separatedFROMlists, baselineJOIN ... USING (...)explicit-projection queries over simple base-table joins, projection aliases with or withoutAS, supported scalar-expression projections such as arithmetic expressions, broader scalar/date-time functions includingFIND_IN_SET/ISNULL,DATE_FORMAT/FROM_UNIXTIME,DATEDIFF/TIMESTAMPDIFF,WEEKDAY/DAYOFWEEK/DAYOFYEAR,MONTHNAME/DAYNAME,QUARTER,LAST_DAY,EXTRACT(<unit> FROM ...), and baseline interval arithmetic throughDATE_ADD/DATE_SUB/TIMESTAMPADD, supported subquery-compatSELECTs whoseWHEREclauses are rewritten through the compatibility layer, including the currentIN/EXISTS/ simple scalar-compare subset, plusCASE/CAST, and simple aggregate / grouped-aggregate compatibility queries) - prepare-time COM_STMT metadata now also stays aligned for supported scalar subqueries in
SELECTlists, including correlated projection subqueries and embedded arithmetic such assalary - (SELECT AVG(...)) - prepared wildcard projections over the supported join subset now execute through the same wildcard-expansion path used by
COM_QUERY, so prepare-time wildcard metadata and execute-time rows stay aligned forSELECT */table.*/db.table.*join shapes - prepared
SELECTresponses are returned over the binary row protocol - read-only server-side cursor execution now works for prepared result sets
- deeper prepare-time metadata parity (more complex joins/subqueries, richer exact types, stricter cursor/driver semantics) remains follow-on work
- The MySQL wire layer also now includes wire protocol support for
COM_INIT_DB(0x02) andCOM_STATISTICS(0x09). - The MySQL wire layer also ships compatibility shims for the checked-in corpus in
tests/compat/corpus.sql, including: BETWEEN/NOT BETWEENinWHEREclausesREGEXP/RLIKE/NOT REGEXPpattern-matching operator<=>(NULL-safe equality operator)SELECT ... FOR UPDATE/FOR SHARE/LOCK IN SHARE MODE(locking hints stripped for compatibility)SELECT VERSION()andSELECT DATABASE()- WordPress-style/bootstrap session queries such as
SET NAMES,SET CHARACTER SET,SET SESSION sql_mode,SET SQL_AUTO_IS_NULL,SET SESSION sql_notes,SET time_zone, transaction-isolation/read-onlySETforms, andSELECT @@sql_mode/@@transaction_isolation/@@sql_auto_is_null - literal session-variable selects with MySQL-style
LIMITforms (LIMIT n,LIMIT offset,n,LIMIT n OFFSET offset) for bootstrap probes such asSELECT @@version_comment SHOW FULL TABLES,SHOW TABLE STATUS,SHOW [FULL] COLUMNS,SHOW INDEX,SHOW CREATE TABLESHOW CREATE DATABASESHOW WARNINGS/SHOW ERRORS(empty result compat)SHOW PROCESSLIST/SHOW FULL PROCESSLIST(single-row stub)SHOW TRIGGERS/SHOW EVENTS/SHOW PROCEDURE STATUS/SHOW FUNCTION STATUS(empty result stubs)SHOW PLUGINS(single-row SkeinDB stub)SHOW PROFILES(empty result stub)SHOW ENGINES/SHOW STORAGE ENGINES(SkeinDB engine row)DESCRIBE/SHOW KEYS- aggregate result emulation for
COUNT(*),COUNT(col),COUNT(DISTINCT col),SUM(col),MIN(col),MAX(col),AVG(col),GROUP_CONCAT(),BIT_AND(),BIT_OR(), andBIT_XOR()on both single-result aggregate queries, single-columnGROUP BYqueries (including compatibility-level single-resultHAVINGwithoutGROUP BY, groupedHAVINGfor simple alias/aggregate-expression top-levelANDpredicates, and compatibility-levelORDER BY/LIMIT/OFFSEThandling), multi-columnGROUP BYqueries with multiple group columns and multiple aggregate expressions, plus multi-count aggregate queries such as WordPress Users-screen role counts built fromCOUNT(NULLIF(<predicate>, false)) - window functions:
ROW_NUMBER(),RANK(), andDENSE_RANK()withOVER(PARTITION BY ... ORDER BY ... [DESC])clauses inSELECTprojections - user variables:
SET @varname = valueandSELECT @varnamewith per-session state - compatibility rewrite for WordPress-style non-aggregate
GROUP BYde-dup queries when grouped columns map to the full projected column set, including wildcard projections after schema expansion, simpleHAVINGpredicates over grouped projected columns / aliases, andSQL_CALC_FOUND_ROWS/FOUND_ROWS()flows - wildcard projection execution for supported join shapes, including
SELECT *, qualifiedtable.*/db.table.*, andSQL_CALC_FOUND_ROWS/FOUND_ROWS()flows over those wildcard join result sets - explicit
CROSS JOINand top-level comma-separatedFROMlists for common compatibility-style join queries NATURAL JOINfor automatic column-name-based join conditionsFULL OUTER JOINwith full execution support (not just parsed)- derived tables / FROM subqueries (
SELECT * FROM (SELECT ...) AS alias) - common table expressions (CTEs):
WITH name AS (SELECT ...) SELECT * FROM name - MySQL-style column
DEFAULThandling forCREATE TABLE/ALTER TABLE ... ADD COLUMN, includingSHOW FULL COLUMNS/SHOW CREATE TABLEoutput KEY/UNIQUE KEYmetadata from MySQL DDL (includingALTER TABLE ... ADD [UNIQUE] KEYandALTER TABLE ... RENAME [KEY|INDEX]), surfaced throughSHOW INDEX/SHOW CREATE TABLEDISTINCT,IN (...)/NOT IN (...),LIKE/NOT LIKE,IS NULL/IS NOT NULL, and parenthesizedAND/ORpredicate trees for common WordPress query shapes, withNULLvalues now treated as SQL-style unknowns in comparison /IN/LIKEpredicates- broader MySQL scalar-function coverage for
LOWER/UPPER,LENGTH/CHAR_LENGTH,TRIM/LTRIM/RTRIM,LEFT/RIGHT,SUBSTRING/SUBSTR,REPLACE,NULLIF,IF,LOCATE,INSTR,FIND_IN_SET,ISNULL,ABS,ROUND,FLOOR,CEIL/CEILING,MOD,LEAST,GREATEST,COALESCE,IFNULL,CONCAT,CONCAT_WS,REPEAT,REVERSE,LPAD,RPAD,SPACE,HEX,UNHEX,FORMAT,SIGN,SQRT,POW/POWER,TRUNCATE,LOG/LN,LOG2,LOG10,EXP,PI,RAND,UUID,SLEEP,BENCHMARK,FIELD,ELT,INET_ATON,INET_NTOA,BIN,OCT,CONV,CRC32,MD5,SHA1/SHA,SHA2,INSERT(string),MAKE_SET,EXPORT_SET,QUOTE,SUBSTRING_INDEX,ASCII,ORD,CHAR,STRCMP,BIT_LENGTH,OCTET_LENGTH,REGEXP_REPLACE,REGEXP_SUBSTR,TO_BASE64,FROM_BASE64,DEGREES,RADIANS,PERIOD_ADD,PERIOD_DIFF,MAKEDATE, andMAKETIMEin translated projections and simple predicates - JSON function coverage:
JSON_EXTRACT,JSON_UNQUOTE,JSON_OBJECT,JSON_ARRAY,JSON_CONTAINS,JSON_LENGTH,JSON_TYPE,JSON_VALID,JSON_SET,JSON_KEYS,JSON_MERGE_PRESERVE,JSON_REMOVE,JSON_REPLACE, andJSON_INSERT - baseline translated date/time scalar functions for
DATE,YEAR,MONTH,DAY/DAYOFMONTH,WEEKDAY,DAYOFWEEK,DAYOFYEAR,MONTHNAME,DAYNAME,QUARTER,LAST_DAY,EXTRACT(<unit> FROM ...),HOUR,MINUTE,SECOND,UNIX_TIMESTAMP,DATE_FORMAT,FROM_UNIXTIME,DATEDIFF,TIMESTAMPDIFF, baseline interval arithmetic throughDATE_ADD/DATE_SUB(withINTERVAL <expr> <unit>syntax) andTIMESTAMPADD,NOW/CURRENT_TIMESTAMP/LOCALTIMESTAMP,CURDATE/CURRENT_DATE/CURTIME/CURRENT_TIME/LOCALTIME,STR_TO_DATE,WEEK,YEARWEEK,CONVERT_TZ,UTC_TIMESTAMP,UTC_DATE,UTC_TIME,SYSDATE,ADDTIME,SUBTIME,TIME_TO_SEC, andSEC_TO_TIME - baseline translated
CASE ... WHEN ... THEN ... ELSE ... ENDandCAST(... AS ...)expression support in projections, simple predicates, and scalar-expressionORDER BYclauses for the current translated subset - baseline translated arithmetic expression support for
+,-,*,/, and%in projections, simple predicates, and scalar-expressionORDER BYclauses, including numeric ordering/filtering patterns such ascol + 0 - secondary-index-backed duplicate-key enforcement for
PRIMARY KEY/UNIQUE KEYwrites (includingPRIMARY KEY-changingUPDATEs), declared PK /UNIQUE KEYconflict routing forREPLACEandON DUPLICATE KEY UPDATE, duplicate-row rejection when creating a MySQL compatibilityUNIQUE INDEXover existing data, durable per-table secondary-index cache metadata (tables/<db>/<table>.sidx.json) that reloads on reopen, and MySQL-style duplicate-key wire errors (1062/23000) SHOW VARIABLES,SHOW STATUS,SHOW CHARACTER SET,SHOW COLLATION,SHOW ENGINES,SHOW GRANTS(including compatibility values for WordPress/common bootstrap variables such assql_auto_is_null, charset/collation variables,time_zone, andtransaction_isolation; unfiltered and scoped forms likeSHOW [SESSION|GLOBAL] VARIABLES; simpleWHERE Variable_name .../WHERE Charset ...filters; plus wildcard patterns likeSHOW VARIABLES LIKE 'character_set_%')information_schema.schematavirtual tableinformation_schema.tablesvirtual table with real table metadata andTABLE_TYPE='VIEW'rows for native materialized viewsinformation_schema.columnsvirtual table with column metadata, ordinal positions, nullable/PK info, MySQL-styleDATA_TYPE/COLUMN_TYPE, character length, numeric precision/scale, charset/collation,EXTRA,COLUMN_COMMENT,PRIVILEGES, andGENERATION_EXPRESSIONfields for ORM/bootstrap probesinformation_schema.statisticsvirtual table with real index data from PK + secondary indexesinformation_schema.key_column_usagevirtual table (PK + UNIQUE key columns)information_schema.table_constraintsvirtual table (PK + UNIQUE constraints)information_schema.character_setsvirtual tableinformation_schema.collationsvirtual tableinformation_schema.enginesvirtual tableinformation_schema.pluginsvirtual table (SkeinDB storage-engine plugin row)information_schema.routinesvirtual table (empty stub)information_schema.triggersvirtual table (empty stub)information_schema.viewsvirtual table with native view definitions and MySQL-style updatability/check-option metadatainformation_schema.eventsvirtual table (empty scheduler-event stub)information_schema.partitionsvirtual table (one non-partitioned metadata row per table)information_schema.referential_constraintsvirtual table (empty FK stub)information_schema.check_constraintsvirtual table (empty CHECK constraint stub)information_schema.parametersvirtual table (empty routine-parameter stub)information_schema.tablespacesvirtual table (single default SkeinDB tablespace row)information_schema.processlistvirtual table (single-row stub)information_schema.user_privilegesvirtual table (single-row stub)information_schema.table_privilegesvirtual table (per-table privilege rows for common ORM probes)- limited subquery compatibility rewrites for common adoption paths:
... WHERE <col> [NOT] IN (SELECT ...),... WHERE [NOT] EXISTS (SELECT ...), and simple scalar comparison predicates such as... WHERE <expr> = (SELECT ...), including parenthesized top-levelAND/ORboolean trees that mix one or more of those predicates with translated non-subquery filters, negatedNOT (...)wrappers when the resulting boolean tree can be pushed back into the translated comparison subset, recursive execution when nested inner subqueries also fit the current compatibility path, plus simple correlated rewrites for base-table subqueries whose outer references are top-level equality clauses (including equality-based correlatedINand multi-columnEXISTSmembership rewrites; scalar subqueries currently require a single projected column and at most one row) USER()/CURRENT_USER()/SESSION_USER()/SYSTEM_USER()session functionsLAST_INSERT_ID()function with session trackingCONNECTION_ID()functionEXPLAINwith real table name extraction from inner query (SELECT/UPDATE/DELETE/INSERT)DOstatementSAVEPOINT/RELEASE SAVEPOINT/ROLLBACK TO SAVEPOINT(no-op stubs)SET GLOBALforms (no-op compat)SET autocommit(including qualified/session forms),BEGIN,COMMIT,ROLLBACK, andSTART TRANSACTIONFLUSH/ANALYZE/OPTIMIZE/CHECK/REPAIR TABLE(no-op compat)KILLcommand (no-op)- compatibility no-op handling for
LOCK TABLES/UNLOCK TABLES crates/skeindb/tests/cluster_rpc.rsnow executes the entire compatibility corpus end-to-end over the MySQL port, so the corpus is enforced as a runtime baseline instead of only documented, including scalar-function, arithmetic-expression, extended date/time-function/formatting coverage (now includingDATEDIFF/TIMESTAMPDIFF,WEEKDAY/DAYOFWEEK/DAYOFYEAR,MONTHNAME/DAYNAME,QUARTER,LAST_DAY,EXTRACT(<unit> FROM ...), plus baseline interval arithmetic throughDATE_ADD/DATE_SUB/TIMESTAMPADD), single-result aggregateHAVINGwithoutGROUP BY, grouped-aggregateHAVING, non-aggregate wildcardGROUP BYde-dup coverage,CASE/CAST, expression-ordering, projection aliases with or withoutAS, wildcard join projection coverage (SELECT *, qualifiedtable.*, and schema-qualifieddb.table.*), explicitCROSS JOIN, comma-list join coverage, baselineJOIN ... USING (...)explicit-projection coverage, parenthesized boolean-tree correlated/nested-subquery coverage, simple scalar-subquery comparison coverage, simple correlatedEXISTScoverage, duplicate-check coverage for creating MySQL compatibility unique indexes,BETWEEN/NOT BETWEEN,COUNT(DISTINCT col),GROUP_CONCAT(),INSERT ... SELECT,UNION/UNION ALL,TRUNCATE TABLE,DROP DATABASE,RENAME TABLE, session functions (USER(),LAST_INSERT_ID(),CONNECTION_ID()),EXPLAINstub,DOstatement,SAVEPOINTstubs, locking hint stripping,information_schema.schemata/information_schema.statistics, expandedSHOWcommands, WordPress Site Health-styleinformation_schema.TABLESstorage summaries, WordPress Users-screen role counts built fromCOUNT(NULLIF(<predicate>, false)), and the full set of additional scalar and date/time functions listed above, plus derived tables (FROM subqueries), CTEs (WITH...AS),REGEXP/RLIKE/NOT REGEXP,<=>(NULL-safe equality),NATURAL JOIN,FULL OUTER JOIN, multi-tableDELETE, multi-tableUPDATE(executed), 14 JSON functions (JSON_EXTRACT/JSON_UNQUOTE/JSON_OBJECT/JSON_ARRAY/JSON_CONTAINS/JSON_LENGTH/JSON_TYPE/JSON_VALID/JSON_SET/JSON_KEYS/JSON_MERGE_PRESERVE/JSON_REMOVE/JSON_REPLACE/JSON_INSERT), additional scalar functions (FIELD/ELT,INET_ATON/INET_NTOA,BIN/OCT/CONV,CRC32,MD5,SHA1/SHA,SHA2,INSERT/MAKE_SET/EXPORT_SET/QUOTE), 22information_schemavirtual tables (tables/columns/schemata/statistics/key_column_usage/table_constraints/character_sets/collations/engines/plugins/routines/triggers/views/events/partitions/referential_constraints/check_constraints/parameters/tablespaces/processlist/user_privileges/table_privileges),SHOW ENGINES,GROUP_CONCATwithSEPARATOR/DISTINCT/ORDER BYstripping, andEXPLAINwith real table extraction. The corpus has expanded to 1657 lines with about 678 semicolon-terminated SQL statements covering all previously TODO-gated features (IF/NULLIF, EXISTS/NOT EXISTS, REGEXP, CAST, COUNT DISTINCT, window functions, CTEs, RIGHT/CROSS/NATURAL/FULL OUTER JOIN, derived tables, IN/scalar subqueries, multi-table DELETE/UPDATE, LOCK/UNLOCK, INFORMATION_SCHEMA), plus new coverage for INSERT...SELECT, DO, EXPLAIN, SAVEPOINT, CREATE/DROP VIEW, system variables (@@version, @@autocommit, etc.), session functions (CURRENT_USER, CONNECTION_ID, FOUND_ROWS), locking hints (FOR UPDATE/FOR SHARE), extended SHOW commands, SET GLOBAL, and additional scalar functions (RAND, UUID, SLEEP, BENCHMARK, ISNULL, FIND_IN_SET, ADDTIME, SUBTIME, CONVERT_TZ, WEEKDAY, MONTHNAME, DAYNAME, QUARTER, SYSDATE, etc.). Separate MySQL-wire regressions now also pin real WordPress installer/admin seed queries: escaped single-rowINSERTs, trailing/* LOCK */comments, serialized_site_transient_wp_theme_files_patterns-*option upserts,information_schema.TABLESprobes usingTABLE_NAME IN (...)plusENGINE = ...,information_schema.table_privilegesprobes, and the WordPress admin Users/Site Health query shapes exercised in the live smoke.- The primary working interface in the scaffold is SkeinQL JSON-RPC over HTTP.
- The SQL story is split:
- SkeinQL includes a full query/expression layer intended to cover common SQL patterns.
- A shipped but intentionally narrow SQL→SkeinQL translation layer now provides the current MySQL-ish subset; broader parity work is still ongoing.
If you want “drop-in MySQL for real apps”, the next concrete milestones are:
1) broaden COM_QUERY parity beyond the current WordPress-class baseline with stricter parity tests (deeper correlated/nested subqueries beyond the current recursive IN / EXISTS / simple scalar-compare compatibility path, broader function/date/time/cast parity beyond the current scalar + date/time baseline, and broader ALTER TABLE variants beyond the current ADD/MODIFY/CHANGE/RENAME COLUMN/RENAME [KEY|INDEX]/RENAME TO/DROP COLUMN plus index metadata surface)
2) deepen prepared-statement parity (complex-query metadata, stricter driver/cursor semantics, fuller protocol coverage) and optimizer parity for production drivers, even though prepare-time metadata now also covers supported scalar-expression projections including baseline arithmetic, broader scalar/date-time functions including FIND_IN_SET / ISNULL, DATE_FORMAT / FROM_UNIXTIME, DATEDIFF / TIMESTAMPDIFF, WEEKDAY / DAYOFWEEK / DAYOFYEAR, MONTHNAME / DAYNAME, QUARTER, LAST_DAY, EXTRACT(<unit> FROM ...), and baseline interval arithmetic through DATE_ADD / DATE_SUB / TIMESTAMPADD, supported subquery-compat SELECTs whose WHERE clauses rewrite cleanly, the current projection-level scalar-subquery slice, embedded scalar-subquery arithmetic, plus CASE / CAST and simple aggregate / grouped-aggregate compatibility shims
Compatibility telemetry and migration hints¶
A central research idea in SkeinDB is to instrument the compatibility layer:
- track which SQL features a workload actually uses
- emit automatic migration hints (MySQL → SkeinQL)
- provide “compatibility coverage” metrics (what percentage of a real workload would run)
See:
- docs/COMPAT_TELEMETRY.md
- the paper section “Compatibility telemetry + migration hints”
Suggested implementation plan¶
A staged plan that keeps SkeinDB single-binary:
1) Protocol skeleton
- handle handshake + capability flags
- accept a connection and respond to a fixed SELECT 1
2) Minimal SQL subset
- implement SELECT ... FROM table WHERE pk = ?
- implement parameter binding
3) Translation to SkeinQL
- parse SQL into an AST
- translate to SkeinQL Query + Expr
4) Coverage + correctness - expand supported functions, joins, aggregation - add golden tests (same query → same result) against MySQL
5) Performance work - prepared statement caching - vectorized execution path for common scans