R02 — Adaptive Row-Column Hybrid Execution

Research Proposal — Mapped to backlog in docs/RESEARCH_BACKLOG.md

🔬 What's Novel

Formal cost model for column snapshot materialization in LSM-based systems
Integration of columnar storage with fine-grained dependency tracking for cache invalidation
Online adaptive materialization decisions without offline workload analysis
Empirical study of row-column tradeoffs using SkeinDB's unified storage model

🔧 Technical Approach

Phase 1 — Cost Model

Formalize the cost of column snapshot creation (scan cost, storage overhead) vs. benefit (reduced I/O for projections). Include compaction interaction costs to model the full lifecycle.
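As a starting point, the cost model could be sketched as below. This is a minimal illustration and not SkeinDB's actual model: every name (SnapshotCostInputs, creation_cost, maintenance_cost, benefit, net_benefit), the abstract per-byte cost units, and the assumption that each compaction rewrites the snapshot once are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotCostInputs:
    """Hypothetical inputs for one candidate column snapshot."""
    rows: int                       # rows covered by the snapshot
    row_bytes: int                  # average full-row size in bytes
    projected_bytes: int            # average size of the projected columns per row
    queries_per_window: float       # expected projection queries per window
    compactions_per_window: float   # expected compactions touching the snapshot
    io_cost_per_byte: float = 1.0       # abstract cost unit per byte read
    storage_cost_per_byte: float = 1.0  # abstract cost unit per byte stored


def creation_cost(c: SnapshotCostInputs) -> float:
    # Scan cost (read every full row once) plus storage overhead
    # (the projected columns are stored a second time).
    scan = c.io_cost_per_byte * c.rows * c.row_bytes
    storage = c.storage_cost_per_byte * c.rows * c.projected_bytes
    return scan + storage


def maintenance_cost(c: SnapshotCostInputs) -> float:
    # Assumption: each compaction rewrites the whole snapshot once.
    return c.io_cost_per_byte * c.compactions_per_window * c.rows * c.projected_bytes


def benefit(c: SnapshotCostInputs) -> float:
    # Each projection query reads only the projected bytes instead of full rows.
    saved_per_row = c.row_bytes - c.projected_bytes
    return c.io_cost_per_byte * c.queries_per_window * c.rows * saved_per_row


def net_benefit(c: SnapshotCostInputs) -> float:
    # Positive values suggest the snapshot pays for itself within one window.
    return benefit(c) - creation_cost(c) - maintenance_cost(c)
```

A real model would likely replace the flat per-byte costs with measured values and account for partial compactions, but the three-term structure (creation, maintenance, benefit) follows the lifecycle described above.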

Phase 2 — Pattern Detection

Analyze query patterns to identify frequently accessed column subsets, scan-heavy queries that would benefit from a columnar format, and temporal access patterns that can drive adaptive thresholds.
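One possible way to track these patterns is a decayed counter per column subset, sketched below. The class name ColumnSubsetTracker, its methods, and the choice of exponential decay with a fixed half-life are assumptions made for illustration; the proposal does not prescribe a specific detector.

```python
from collections import defaultdict


class ColumnSubsetTracker:
    """Hypothetical tracker: scores column subsets by recent scan volume."""

    def __init__(self, half_life_s: float):
        self.half_life_s = half_life_s
        self._score = defaultdict(float)  # column subset -> score at last update
        self._last_seen = {}              # column subset -> time of last update

    def _decayed(self, key, now_s: float) -> float:
        # Score halves every half_life_s seconds since the last update.
        last = self._last_seen.get(key)
        if last is None:
            return 0.0
        return self._score[key] * 0.5 ** ((now_s - last) / self.half_life_s)

    def record(self, columns, now_s: float, rows_scanned: int = 1) -> None:
        # Weight by rows scanned so scan-heavy queries dominate the ranking.
        key = frozenset(columns)
        self._score[key] = self._decayed(key, now_s) + rows_scanned
        self._last_seen[key] = now_s

    def top(self, k: int, now_s: float):
        # Return the k highest-scoring subsets as (columns, score) pairs.
        ranked = sorted(
            ((key, self._decayed(key, now_s)) for key in self._last_seen),
            key=lambda pair: pair[1],
            reverse=True,
        )
        return ranked[:k]
```

The half-life acts as the temporal knob: a short half-life reacts quickly to shifts in the workload, while a long one favors stable, long-running access patterns.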

Phase 3 — Dependency Integration

Extend dependency tracking to column granularity. On row-level updates, mark only the affected column snapshots for incremental refresh or invalidation, avoiding full recomputation.
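The idea can be illustrated with a small in-memory index that maps columns to the snapshots depending on them. This is a sketch under stated assumptions: ColumnDependencyIndex, its methods, and the use of plain row keys and snapshot ids are hypothetical and do not reflect SkeinDB's existing dependency tracker.

```python
from collections import defaultdict


class ColumnDependencyIndex:
    """Hypothetical column-granular dependency index for snapshots."""

    def __init__(self):
        self._by_column = defaultdict(set)   # column -> snapshot ids using it
        self._dirty_rows = defaultdict(set)  # snapshot id -> rows pending refresh

    def register(self, snapshot_id, columns) -> None:
        # Declare which columns a snapshot materializes.
        for col in columns:
            self._by_column[col].add(snapshot_id)

    def on_row_update(self, row_key, changed_columns) -> set:
        # Mark only snapshots that contain a changed column; others stay clean.
        affected = set()
        for col in changed_columns:
            affected |= self._by_column.get(col, set())
        for snapshot_id in affected:
            self._dirty_rows[snapshot_id].add(row_key)
        return affected

    def drain(self, snapshot_id) -> set:
        # Hand the pending rows to an incremental refresh and clear them.
        return self._dirty_rows.pop(snapshot_id, set())
```

An update that touches only `price` would leave a snapshot over `name` untouched, which is the overhead saving the phase aims to measure against full invalidation.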

Phase 4 — Adaptive Materialization

Run a continuous controller that evaluates materialization decisions against recent query patterns, resource availability, and cost-benefit thresholds. No offline workload analysis is required.
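A single evaluation step of such a controller might look like the sketch below. The names Candidate and plan, the storage budget as the only resource constraint, and the two-threshold (hysteresis) rule are illustrative assumptions, not a committed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate snapshot with an estimated net benefit."""
    name: str
    net_benefit: float  # estimated from recent query patterns
    size_bytes: int


def plan(candidates, materialized, budget_bytes, create_threshold, drop_threshold):
    """Return (to_create, to_drop) for one controller tick.

    Using create_threshold > drop_threshold gives hysteresis, so a snapshot
    hovering near the break-even point is not created and dropped repeatedly.
    """
    to_drop = {
        c.name for c in candidates
        if c.name in materialized and c.net_benefit < drop_threshold
    }
    # Storage still held by snapshots that survive this tick.
    used = sum(
        c.size_bytes for c in candidates
        if c.name in materialized and c.name not in to_drop
    )
    # Greedy choice: best benefit per byte first, within the storage budget.
    fresh = sorted(
        (c for c in candidates
         if c.name not in materialized and c.net_benefit > create_threshold),
        key=lambda c: c.net_benefit / max(c.size_bytes, 1),
        reverse=True,
    )
    to_create = []
    for c in fresh:
        if used + c.size_bytes <= budget_bytes:
            to_create.append(c.name)
            used += c.size_bytes
    return to_create, sorted(to_drop)
```

Calling `plan` on a timer with freshly estimated benefits keeps the decision fully online; the greedy benefit-per-byte ordering is a simple heuristic and could be swapped for something more careful once the cost model is validated.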

🧪 Hypotheses

Query pattern analysis can predict which column projections offer the highest benefit-to-cost ratio for materialization.

Column-granular dependency tracking can keep column snapshots consistent at lower overhead than full invalidation.

Adaptive materialization decisions can be made online without offline workload analysis, adapting to changing access patterns.

🔗 SkeinDB Integration

ValueID Store

Dependency Tracking

LSM / Compaction

SkeinQL RPC

Column Snapshots

📚 Key References

Arulraj et al. — "Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads" (2016)
Pavlo et al. — "Self-Driving Database Management Systems" (2017)
