R20 · Systems Research

Energy-Aware Compaction Scheduling

SkeinDB's workload-guided compaction considers performance metrics but not energy consumption. For edge and embedded deployments, battery life is critical; for cloud deployments, energy cost and carbon footprint matter. An energy-aware compaction scheduler defers work to periods of low activity, external power availability, or favorable electricity pricing, while keeping read, write, and space amplification within acceptable bounds.

Research Proposal — Mapped to backlog in docs/RESEARCH_BACKLOG.md

🔬 What's Novel

🔧 Technical Approach

Phase 1 — Energy Modeling

Model energy per compaction operation as a function of LSM state and compaction size. Include SSD garbage collection interactions and CPU power states in the model.
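A minimal sketch of what such a model could look like, assuming a simple additive form (I/O energy plus CPU energy plus a wake-up cost). Every name and coefficient here is hypothetical and not part of SkeinDB; real coefficients would have to be fitted from measurements on the target hardware, and the linear SSD garbage-collection multiplier is only a placeholder for that interaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyCoefficients:
    """Hypothetical per-device coefficients; placeholders, not measured values."""
    read_j_per_mb: float = 0.002
    write_j_per_mb: float = 0.006
    cpu_active_watts: float = 4.0
    cpu_wakeup_j: float = 0.5  # assumed cost of leaving a low-power CPU state


@dataclass(frozen=True)
class CompactionJob:
    input_mb: float        # data read from the source levels
    output_mb: float       # data written to the target level
    cpu_seconds: float     # merge and compression CPU time
    ssd_fill_ratio: float  # 0..1, used as a rough proxy for SSD GC pressure


def estimate_energy_joules(job, coeff=EnergyCoefficients(), cpu_idle=False):
    """Estimate the energy of one compaction under the assumed additive model."""
    # Assumption: device-level write cost grows linearly as the drive fills.
    gc_multiplier = 1.0 + job.ssd_fill_ratio
    io_j = (job.input_mb * coeff.read_j_per_mb
            + job.output_mb * coeff.write_j_per_mb * gc_multiplier)
    cpu_j = job.cpu_seconds * coeff.cpu_active_watts
    # Waking an idle CPU just for compaction adds a fixed penalty.
    wake_j = coeff.cpu_wakeup_j if cpu_idle else 0.0
    return io_j + cpu_j + wake_j
```

With these placeholder coefficients, the same job costs more when it forces the CPU out of an idle state, which is the kind of timing dependence the model is meant to expose.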

Phase 2 — Constraint Specification

Define performance constraints: maximum acceptable read amplification, write amplification, and compaction backlog (measured as space amplification). Energy optimization operates only within these bounds.
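One way to express these bounds is as a small value type that the scheduler checks before deferring any work. The sketch below is illustrative only: `PerformanceBounds`, `LsmMetrics`, and `violated_bounds` are assumed names, not existing SkeinDB types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceBounds:
    """Hypothetical upper limits supplied by the operator."""
    max_read_amp: float
    max_write_amp: float
    max_space_amp: float


@dataclass(frozen=True)
class LsmMetrics:
    """Current or projected amplification figures for the LSM tree."""
    read_amp: float
    write_amp: float
    space_amp: float


def violated_bounds(metrics, bounds):
    """Return the names of exceeded bounds; an empty list means deferral is allowed."""
    checks = [
        ("read_amp", metrics.read_amp, bounds.max_read_amp),
        ("write_amp", metrics.write_amp, bounds.max_write_amp),
        ("space_amp", metrics.space_amp, bounds.max_space_amp),
    ]
    return [name for name, value, limit in checks if value > limit]
```

A scheduler built on this would defer a compaction only when the projected metrics for the deferred state return an empty list.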

Phase 3 — Scheduling Algorithm

Predict future compaction needs, estimate energy cost at different scheduling times, and optimize scheduling to minimize total energy while satisfying all performance constraints.
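As a rough illustration of the optimization step, the sketch below uses a greedy heuristic rather than an optimal solver. It assumes time is divided into discrete slots, that each predicted compaction has a base energy estimate and a latest slot by which it must run to stay within the performance constraints, and that each slot has a relative energy-cost multiplier. All names and the slot model are assumptions made for this example.

```python
def schedule_compactions(jobs, slot_multiplier, capacity=1):
    """Greedy sketch: assign each job to the cheapest free slot before its deadline.

    jobs: list of (name, base_energy_joules, deadline_slot) tuples.
    slot_multiplier: relative energy cost of running in each time slot.
    capacity: maximum number of compactions per slot.
    Returns (plan, total_cost) where plan maps job name to slot index.
    """
    load = [0] * len(slot_multiplier)
    plan = {}
    total_cost = 0.0
    # Handle the tightest deadlines first so they are not crowded out.
    for name, energy_j, deadline in sorted(jobs, key=lambda j: j[2]):
        last = min(deadline, len(slot_multiplier) - 1)
        candidates = [s for s in range(last + 1) if load[s] < capacity]
        if not candidates:
            raise ValueError(f"no feasible slot for {name} before its deadline")
        best = min(candidates, key=lambda s: slot_multiplier[s])
        load[best] += 1
        plan[name] = best
        total_cost += energy_j * slot_multiplier[best]
    return plan, total_cost


plan, cost = schedule_compactions(
    jobs=[("L0->L1", 5.0, 1), ("L1->L2", 8.0, 3)],
    slot_multiplier=[1.0, 0.4, 0.8, 0.3],
)
```

Earliest-deadline-first ordering keeps the example short, but it does not guarantee minimum total energy; a real scheduler would likely need a proper optimization formulation once slot capacity and job interactions matter.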

Phase 4 — External Signals

Integrate real-world signals: power source (battery vs. plugged), electricity grid pricing, carbon intensity of the grid, and predicted workload patterns for lookahead scheduling.
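These signals could be folded into a single per-slot cost that the scheduler then minimizes. The following sketch assumes a weighted sum with a multiplicative penalty on battery power; the field names, weights, and units are hypothetical, and the feeds that would supply the values are not specified by the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalSignals:
    """Hypothetical snapshot of external conditions for one time slot."""
    on_battery: bool
    price_per_kwh: float     # electricity price in local currency units
    carbon_g_per_kwh: float  # grid carbon intensity
    predicted_load: float    # forecast foreground utilization, 0..1


def slot_cost_multiplier(sig, price_weight=1.0, carbon_weight=0.001,
                         load_weight=1.0, battery_penalty=2.0):
    """Combine the signals into one relative cost; the weights are placeholders."""
    cost = (price_weight * sig.price_per_kwh
            + carbon_weight * sig.carbon_g_per_kwh
            + load_weight * sig.predicted_load)
    # Assumption: running on battery makes any compaction proportionally costlier.
    return cost * (battery_penalty if sig.on_battery else 1.0)


night_plugged = ExternalSignals(False, 0.10, 200.0, 0.1)
peak_on_battery = ExternalSignals(True, 0.30, 500.0, 0.8)
```

Under these placeholder weights the plugged-in, low-load night slot scores far lower than the peak slot on battery, so a cost-minimizing scheduler would prefer it.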

🧪 Hypotheses

H1

Compaction timing significantly impacts total energy consumption due to SSD write amplification and CPU utilization patterns.

H2

Deferring compaction to off-peak periods reduces energy costs without causing unacceptable performance degradation.

H3

Energy-aware scheduling with explicit performance constraints provides practical tradeoffs for edge and cloud deployments.

🔗 SkeinDB Integration

LSM / Compaction
Compaction Scheduler
Observability
SkeinQL RPC
External APIs
