🔬 What's Novel
- First formal analysis of delta chain topologies in LSM-based MVCC systems
- Provable bounds on version reconstruction cost for skip-list delta chains
- Compaction-integrated topology optimization algorithms that restructure chains opportunistically
- Empirical characterization of topology-workload interactions across OLTP, OLAP, and hybrid patterns
🔧 Technical Approach
Phase 1 — Formal Model
Build a cost model for delta chain operations: version reconstruction cost vs. chain length/topology, write cost for appending new deltas, and space overhead for each topology variant.
Phase 2 — Topology Design
Evaluate three topologies: (a) linear chains with periodic full snapshots, (b) skip-list chains with geometric snapshot spacing for O(log n) reconstruction, (c) tree-structured chains with branching at version forks.
Phase 3 — Compaction Integration
Design compaction policies that consider delta chain state: consolidate long chains, restructure topology by adding skip pointers, and prune unreachable versions.
Phase 4 — Adaptive Selection
Per-key topology selection based on access patterns. Keys with frequent historical reads get skip-list chains; keys accessed only at latest version use simple linear chains.
🧪 Hypotheses
Skip-list-style delta chains provide O(log n) version reconstruction with O(1) write amplification overhead.
Workload-aware chain topology selection outperforms any single static topology across mixed workloads.
Compaction can restructure delta chains opportunistically, amortizing reorganization cost into existing I/O.
🔗 SkeinDB Integration
📚 Key References
- Wu et al. — "An Empirical Evaluation of In-Memory Multi-Version Concurrency Control" (2017)
- Neumann et al. — "Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems" (2015)