🔬 What's Novel
- Integration of incremental view maintenance with dependency-tracking CDC infrastructure
- Cost model for incremental vs. full recomputation in LSM-based storage systems
- Automatic delta query derivation for SkeinQL view definitions
- Unified framework connecting caching, CDC, and materialized views through one mechanism
🔧 Technical Approach
Phase 1 — Delta Derivation
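Derive delta queries automatically from view definitions. For a view V = Q(R) over a base relation R, derive a delta query dQ such that dV = dQ(R, dR), where dR is the set of changes to R. Applying dV to V brings the view up to date without recomputing Q(R) from scratch.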
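To make the delta rule concrete, here is a minimal Python sketch for one view shape, an equi-join of two relations. It is not SkeinQL and not the proposed derivation engine; relations are modeled as multisets with signed multiplicities (+1 for an insert, -1 for a delete), and the helper names `join`, `apply_delta`, and `delta_join` are hypothetical. The rule it applies is the standard one for joins: d(R JOIN S) = (dR JOIN S) + (R JOIN dS) + (dR JOIN dS), with R and S taken before the update.

```python
from collections import Counter

def join(r, s):
    """Multiset equi-join on the first field; multiplicities multiply."""
    out = Counter()
    for (k1, a), m1 in r.items():
        for (k2, b), m2 in s.items():
            if k1 == k2:
                out[(k1, a, b)] += m1 * m2
    return out

def apply_delta(rel, delta):
    """Add signed multiplicities to a relation and drop rows that reach zero."""
    out = Counter(rel)
    for row, m in delta.items():
        out[row] += m
        if out[row] == 0:
            del out[row]
    return out

def delta_join(r, s, d_r, d_s):
    """Delta of the join view, computed from the pre-update R and S."""
    d_v = Counter()
    for part in (join(d_r, s), join(r, d_s), join(d_r, d_s)):
        for row, m in part.items():
            d_v[row] += m
    # Keep only rows whose net change is non-zero.
    return Counter({row: m for row, m in d_v.items() if m != 0})

# Hypothetical data: R(key, a) and S(key, b).
r = Counter({(1, "a"): 1, (2, "b"): 1})
s = Counter({(1, "x"): 1})
d_r = Counter({(1, "c"): 1, (1, "a"): -1})   # insert (1, c), delete (1, a)
d_s = Counter({(2, "y"): 1})                 # insert (2, y)

view = join(r, s)
incremental = apply_delta(view, delta_join(r, s, d_r, d_s))
recomputed = join(apply_delta(r, d_r), apply_delta(s, d_s))
```

The last two lines are the correctness check for any derived delta query: applying dV to the old view must give the same result as re-running Q on the updated inputs.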
Phase 2 — Dependency Graph Extension
Extend dependency tracking to represent the relationships between views and their base tables. When data changes, traverse the graph to identify the affected views and the delta queries each one requires.
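The traversal described in this phase could look roughly like the following sketch. The `DependencyGraph` class, its methods, and the table and view names are all hypothetical and do not reflect SkeinDB's actual dependency tracker. For each affected view, the result records which of its inputs changed, which is the information needed to pick the right delta query.

```python
from collections import defaultdict, deque

class DependencyGraph:
    """Hypothetical graph: edges point from a table or view to the views reading it."""

    def __init__(self):
        self._dependents = defaultdict(set)

    def add_view(self, view, sources):
        for source in sources:
            self._dependents[source].add(view)

    def affected(self, changed_table):
        """Map each affected view to the set of its inputs that changed."""
        changed_inputs = defaultdict(set)
        seen = {changed_table}
        queue = deque([changed_table])
        while queue:
            node = queue.popleft()
            for view in self._dependents.get(node, ()):
                changed_inputs[view].add(node)
                if view not in seen:
                    seen.add(view)
                    queue.append(view)
        return dict(changed_inputs)

graph = DependencyGraph()
graph.add_view("order_totals", ["orders"])
graph.add_view("top_customers", ["order_totals", "customers"])
graph.add_view("region_stats", ["customers"])

hit_by_orders = graph.affected("orders")
hit_by_customers = graph.affected("customers")
```

Here a change to `orders` reaches `top_customers` only through `order_totals`, so `top_customers` needs the delta query with respect to that input, while `region_stats` is left alone.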
Phase 3 — Cost-Based Switching
A cost model chooses between incremental maintenance and full recomputation based on delta size, view complexity, and staleness tolerance. The choice is re-evaluated at runtime, so a view can switch strategies dynamically.
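As an illustration only, a toy version of such a decision rule is sketched below. Every name and constant in it is an assumption: the cost weights are made-up placeholders rather than measured LSM costs, `lookups_per_delta_row` stands in for view complexity, and treating staleness tolerance as permission to defer maintenance is one possible interpretation among several.

```python
from dataclasses import dataclass
from enum import Enum

class Strategy(Enum):
    DEFER = "defer"
    INCREMENTAL = "incremental"
    FULL = "full"

@dataclass(frozen=True)
class ViewStats:
    base_rows: int                 # rows a full recomputation would scan
    delta_rows: int                # pending changes since the last refresh
    lookups_per_delta_row: float   # assumed proxy for view complexity
    staleness_s: float             # seconds since the view was last refreshed
    max_staleness_s: float         # staleness the view's consumers tolerate

# Assumed relative costs: a point lookup through LSM levels is taken to be
# far more expensive than one row of a sequential scan. Placeholder values.
POINT_LOOKUP_COST = 20.0
SEQ_ROW_COST = 1.0

def choose_strategy(stats):
    # Within the staleness budget, wait so that deltas can accumulate.
    if stats.delta_rows == 0 or stats.staleness_s < stats.max_staleness_s:
        return Strategy.DEFER
    incremental = stats.delta_rows * stats.lookups_per_delta_row * POINT_LOOKUP_COST
    full = stats.base_rows * SEQ_ROW_COST
    return Strategy.INCREMENTAL if incremental < full else Strategy.FULL

small_batch = choose_strategy(ViewStats(1_000_000, 100, 2.0, 5.0, 1.0))
large_batch = choose_strategy(ViewStats(1_000_000, 100_000, 2.0, 5.0, 1.0))
still_fresh = choose_strategy(ViewStats(1_000_000, 100, 2.0, 0.5, 1.0))
```

With these placeholder weights, a 100-row delta against a million-row base is maintained incrementally, while a 100,000-row delta tips the decision to full recomputation. Calling the function on fresh statistics before each refresh gives the runtime switching behavior.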
Phase 4 — Cascading Updates
Support views defined on other views: order the dependency graph topologically, propagate deltas through multiple levels, and batch cascading updates so that each view is refreshed once per batch.
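The ordering step can be sketched with Python's standard `graphlib.TopologicalSorter` (Python 3.9+). The function `refresh_order` and the view names are hypothetical. The sketch only plans the refresh sequence and does not propagate actual deltas. Batching here means that a view with several changed inputs appears in the plan once, after all of those inputs.

```python
from graphlib import TopologicalSorter

def refresh_order(view_inputs, changed_tables):
    """Return affected views in an order where every input precedes its readers.

    view_inputs maps each view to the set of tables or views it reads.
    """
    # static_order() yields every node, inputs before the views that read them.
    order = TopologicalSorter(view_inputs).static_order()
    dirty = set(changed_tables)
    plan = []
    for node in order:
        if node in view_inputs and dirty & set(view_inputs[node]):
            dirty.add(node)      # a refreshed view dirties its own readers
            plan.append(node)
    return plan

view_inputs = {
    "order_totals": {"orders"},
    "top_customers": {"order_totals", "customers"},
    "region_stats": {"customers"},
}

plan_orders = refresh_order(view_inputs, {"orders"})
plan_both = refresh_order(view_inputs, {"orders", "customers"})
```

When both `orders` and `customers` change in the same batch, `top_customers` is scheduled a single time, after `order_totals`, rather than once per changed input.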
🧪 Hypotheses
- Dependency tracking can be extended to compute delta queries for incremental view maintenance with bounded overhead.
- For common view patterns, incremental maintenance is cheaper than recomputation when update batches are small relative to view size.
- The dependency graph can identify cascading view updates and batch them efficiently for multi-level view hierarchies.
🔗 SkeinDB Integration
📚 Key References
- Gupta & Mumick — "Maintenance of Materialized Views: Problems, Techniques, and Applications" (1995)
- McSherry et al. — "Differential Dataflow" (2013)
- Gjengset et al. — "Noria: Dynamic, Partially-Stateful Data-Flow for High-Performance Web Applications" (2018)