Revision 0c8f7f6ceb75f22ae00ecae9a2354e08d1cf689b

Committed on 15/12/2025 8:49 am by Zohaib Sibte Hassan <zohaib@sibte.pk>

Implement adaptive checkpoint strategy for SQLite batch committer

Add intelligent WAL checkpoint management to reduce p99 flush latency from
~100ms to ≤10ms by moving checkpoints to background goroutines.

Core Features:
- WAL size monitoring via fast file stat (~10μs syscall)
- Adaptive checkpoint modes: PASSIVE once the WAL reaches 4MB, RESTART once it reaches 16MB
- Background checkpoint execution (non-blocking)
- Dynamic batch sizing (2x during checkpoint to exploit parallel window)
- Timer-based flush disabled during checkpoint to prevent interference
- 7 new Prometheus metrics for checkpoint observability
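
The size check and threshold selection listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed code: the names walSize, chooseMode, and effectiveBatchSize are hypothetical, the thresholds are hard-coded rather than read from BatchCommitConfiguration, and the WAL is assumed to sit next to the database file with SQLite's standard "-wal" suffix.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// checkpointMode names the SQLite checkpoint mode to request.
type checkpointMode string

const (
	modeNone    checkpointMode = ""
	modePassive checkpointMode = "PASSIVE"
	modeRestart checkpointMode = "RESTART"
)

// Thresholds taken from the commit message; hard-coded here for illustration.
const (
	passiveThresholdBytes = 4 << 20  // 4MB
	restartThresholdBytes = 16 << 20 // 16MB
)

// walSize stats the "-wal" file beside dbPath (hypothetical helper).
// A missing WAL file is treated as size 0 rather than an error.
func walSize(dbPath string) (int64, error) {
	info, err := os.Stat(dbPath + "-wal")
	if errors.Is(err, fs.ErrNotExist) {
		return 0, nil
	}
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// chooseMode maps a WAL size to a checkpoint mode, checking the larger
// threshold first so RESTART wins once the WAL reaches 16MB.
func chooseMode(walBytes int64) checkpointMode {
	switch {
	case walBytes >= restartThresholdBytes:
		return modeRestart
	case walBytes >= passiveThresholdBytes:
		return modePassive
	default:
		return modeNone
	}
}

// effectiveBatchSize doubles the batch limit while a checkpoint is running.
func effectiveBatchSize(maxBatch int, checkpointing bool) int {
	if checkpointing {
		return 2 * maxBatch
	}
	return maxBatch
}

func main() {
	size, err := walSize("example.db") // hypothetical path
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	mode := chooseMode(size)
	fmt.Printf("wal=%d bytes mode=%q batch=%d\n",
		size, mode, effectiveBatchSize(100, mode != modeNone))
}
```

Checking the RESTART threshold before the PASSIVE one keeps the selection a single pass with no overlapping cases.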

Implementation:
- Add checkpoint configuration to BatchCommitConfiguration
- Extend SQLiteBatchCommitter with checkpoint state tracking (atomic.Bool)
- Implement checkWALSize() for fast WAL file size checks
- Implement backgroundCheckpoint() with PRAGMA wal_checkpoint and metrics
- Trigger the checkpoint after tx.Commit() and before promise resolution
- Update NewSQLiteBatchCommitter signature with checkpoint parameters
- Update all callers (db_integration.go, tests)
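
One way the atomic.Bool guard and background execution might fit together is sketched below. It is not the actual SQLiteBatchCommitter code: the checkpointer type, its trigger method, and the injected run function are hypothetical, and the database call is abstracted away so the example runs without a SQLite driver. In real code, run would execute `PRAGMA wal_checkpoint(PASSIVE)` or `PRAGMA wal_checkpoint(RESTART)`, which returns one row of three integers: a busy flag, the number of frames in the WAL, and the number of frames checkpointed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// checkpointResult mirrors the three integers returned by PRAGMA wal_checkpoint.
type checkpointResult struct {
	Busy               int
	LogFrames          int
	CheckpointedFrames int
}

// checkpointFunc stands in for the database call (assumption for this sketch).
type checkpointFunc func(mode string) (checkpointResult, error)

// checkpointer is a hypothetical reduction of the committer's checkpoint state.
type checkpointer struct {
	inProgress atomic.Bool
	run        checkpointFunc
	wg         sync.WaitGroup
}

// trigger starts a background checkpoint unless one is already running.
// It returns true if a new checkpoint goroutine was started.
func (c *checkpointer) trigger(mode string) bool {
	// CompareAndSwap lets exactly one caller flip false -> true.
	if !c.inProgress.CompareAndSwap(false, true) {
		return false
	}
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		defer c.inProgress.Store(false) // runs before wg.Done (LIFO order)
		// Real code would record duration and frame metrics from the result.
		_, _ = c.run(mode)
	}()
	return true
}

func main() {
	c := &checkpointer{run: func(mode string) (checkpointResult, error) {
		// Placeholder result; a real implementation would query SQLite here.
		return checkpointResult{LogFrames: 100, CheckpointedFrames: 100}, nil
	}}
	started := c.trigger("PASSIVE")
	c.wg.Wait()
	fmt.Println("started:", started, "still running:", c.inProgress.Load())
}
```

Because trigger returns immediately, the caller can resolve promises without waiting for the checkpoint, which is the non-blocking behavior the commit describes.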

Metrics Added:
- batch_committer_checkpoint_total (counter by mode)
- batch_committer_checkpoint_duration_ms (histogram)
- batch_committer_wal_size_mb (histogram)
- batch_committer_wal_frames_log (histogram)
- batch_committer_wal_frames_checkpointed (histogram)
- batch_committer_checkpoint_busy_total (counter)
- batch_committer_checkpoint_efficiency (histogram)
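
The commit message does not define the efficiency metric. A plausible reading, given the "nearly all frames checkpointed" note in the benchmark results, is the ratio of checkpointed frames to frames in the WAL. The sketch below assumes that definition; checkpointEfficiency is a hypothetical helper, not a function from the commit.

```go
package main

import "fmt"

// checkpointEfficiency returns checkpointedFrames / logFrames.
// The second return value is false when the ratio is undefined, for example
// when SQLite reports -1 because the database is not in WAL mode, or when
// the WAL holds no frames.
func checkpointEfficiency(logFrames, checkpointedFrames int) (float64, bool) {
	if logFrames <= 0 || checkpointedFrames < 0 {
		return 0, false
	}
	return float64(checkpointedFrames) / float64(logFrames), true
}

func main() {
	if eff, ok := checkpointEfficiency(1000, 990); ok {
		fmt.Printf("efficiency: %.2f\n", eff) // prints "efficiency: 0.99"
	}
}
```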

Benchmark Results (200K transactions):
- 8,132 PASSIVE checkpoints (Node 1), zero RESTART checkpoints
- WAL kept at 4-8MB (never hit 16MB emergency threshold)
- Checkpoint p99 duration: ~10ms
- Checkpoint efficiency: 99% (nearly all frames checkpointed)
- Dynamic batching confirmed: batches reached 200 (2x max_batch_size)

Test Coverage:
- TestBatchCommitter_NoCheckpointBelowThreshold
- TestBatchCommitter_PassiveCheckpointTriggered
- TestBatchCommitter_RestartCheckpointTriggered
- TestBatchCommitter_DynamicBatchSizing
- TestBatchCommitter_TimerDisabledDuringCheckpoint
- TestBatchCommitter_ConcurrentOperationsWithCheckpoint
- TestBatchCommitter_CheckpointMetrics

All tests pass including race detection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>