Peak Demand Planning Without Spreadsheet Guesswork
The Situation
Peak Demand Planning Engine is a planning tool for facility managers and energy teams that turns meter history, tariffs, forecasts, and asset limits into peak-window action plans. Teams need it when spreadsheets and manual judgment make curtailment, load shifting, battery use, or no-action decisions hard to defend before expensive demand windows.
The user is not trying to control equipment from software. The PRD explicitly excludes SCADA commands and demand-response market enrollment. The product answers a narrower operational question: given the data available right now, what action is physically feasible, what savings does it estimate, what risk does it create, and what did it reject?
That distinction matters. The shipped target covers 35,000 interval rows, 24-hour plan windows, 12-month replay evidence, and operator decisions that preserve the original recommendation. It is built for commercial facilities where the cost of being wrong is not one bad chart but a facility team losing confidence in every future peak decision.
The Cost of Doing Nothing
The repository does not contain a live facility bill, so I will not claim customer savings that are not in the evidence. The cost I can defend is the operating pattern the PRD names: teams decide from interval data, tariffs, forecasts, and asset constraints by hand.
Even a modest manual review loop gets expensive. A facilities analyst spending four hours a month at a conservative £45 loaded hourly cost adds up to 48 hours, or roughly £2,160, a year before any avoidable demand charge is counted. That estimate is not the product's claimed ROI. It is the floor. The real exposure is a plan accepted from stale data, a battery action that violates cycle limits, or a curtailment recommendation that ignores comfort limits.
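The floor estimate above is simple enough to check directly. The sketch below only restates the figures already given in the text; the function name is illustrative and not part of the product.

```python
# Illustrative floor on manual review cost, using the figures from the text.
HOURS_PER_MONTH = 4          # analyst time spent on manual peak review
LOADED_HOURLY_COST_GBP = 45  # conservative loaded hourly cost
MONTHS_PER_YEAR = 12


def annual_review_cost(hours_per_month: float, hourly_cost: float) -> float:
    """Return the yearly labour cost of a recurring monthly review."""
    return hours_per_month * hourly_cost * MONTHS_PER_YEAR


floor_cost = annual_review_cost(HOURS_PER_MONTH, LOADED_HOURLY_COST_GBP)
print(f"£{floor_cost:,.0f} per year")  # £2,160 per year
```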
The engine replaces that ambiguity with visible action, visible rejection, and replayable evidence.
What I Built
I built a multi-tenant planning engine with a server-rendered operations UI, a Nim API, PostgreSQL 16 with TimescaleDB for interval readings, DuckDB for replay scans, and a Postgres-backed job runner for forecasts, plans, expiry checks, replay, peak-risk scans, and outbox dispatch.
The hard part was not generating a recommendation. The hard part was refusing to recommend the wrong thing. Batch 012 found that a stale forecast could become a ready no-action plan because fallback logic converted invalid input into a calm answer. The fix separates input failure from real physical infeasibility. A no-action plan is now a deliberate result, not a hiding place for bad data.
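One way to picture that separation is a planner entry point that validates input before it ever considers feasibility. This is a minimal Python sketch, not the shipped Nim code. The status names, the `build_plan_outcome` function, and the six-hour freshness limit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class PlanStatus(Enum):
    READY = "ready"
    READY_NO_ACTION = "ready_no_action"  # valid input, nothing feasible to do
    FAILED = "failed"                    # input could not be trusted


@dataclass(frozen=True)
class PlanOutcome:
    status: PlanStatus
    reason: Optional[str] = None


# Hypothetical freshness limit; the real threshold is not stated in the text.
MAX_FORECAST_AGE = timedelta(hours=6)


def build_plan_outcome(
    forecast_issued_at: datetime,
    now: datetime,
    feasible_actions: list[str],
) -> PlanOutcome:
    # Input checks run first, so stale data can never reach the no-action branch.
    if now - forecast_issued_at > MAX_FORECAST_AGE:
        return PlanOutcome(PlanStatus.FAILED, "forecast_stale")
    # Only valid input may conclude that no action is physically feasible.
    if not feasible_actions:
        return PlanOutcome(PlanStatus.READY_NO_ACTION, "no_feasible_action")
    return PlanOutcome(PlanStatus.READY)
```

The ordering is the point: a stale forecast returns a failure with a reason before the feasibility check can produce a calm-looking no-action plan.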
The UI shows selected actions, rejected actions, binding constraints, savings, comfort risk, battery-cycle impact, confidence band, and replay/export evidence. Operator decisions can accept, reject, or modify a plan without mutating the original planner output.
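The non-mutating decision model can be sketched as an immutable planner record plus an append-only list of operator decisions. The type names, field names, and action labels below are hypothetical and do not describe the product's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class PlannerOutput:
    """What the planner recommended; frozen so it cannot be edited later."""
    plan_id: str
    selected_actions: Tuple[str, ...]
    rejected_actions: Tuple[str, ...]


@dataclass(frozen=True)
class OperatorDecision:
    """A human decision recorded alongside the plan, never inside it."""
    plan_id: str
    verdict: str  # "accept", "reject", or "modify"
    modified_actions: Optional[Tuple[str, ...]] = None
    note: str = ""


def resolve_actions(
    output: PlannerOutput, decisions: Sequence[OperatorDecision]
) -> Tuple[str, ...]:
    """Return the actions in effect after applying the latest decision."""
    relevant = [d for d in decisions if d.plan_id == output.plan_id]
    if not relevant:
        return output.selected_actions
    latest = relevant[-1]
    if latest.verdict == "reject":
        return ()
    if latest.verdict == "modify" and latest.modified_actions is not None:
        return latest.modified_actions
    return output.selected_actions


# Hypothetical example: the operator trims the plan to one action.
original = PlannerOutput(
    plan_id="p1",
    selected_actions=("shift_hvac", "discharge_battery"),
    rejected_actions=("curtail_lighting",),
)
history = [OperatorDecision("p1", "modify", ("shift_hvac",), "battery under maintenance")]
in_effect = resolve_actions(original, history)
```

Because the decision lives in its own record, the audit trail can always show both what the planner proposed and what the operator actually approved.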
System Flow
Data Model
Architecture Layers
The Decision Log
| Decision | Alternative Rejected | Why |
|---|---|---|
| Nim and Mummy for the service | A large batteries-included web framework | The domain work is planner logic, validation, and replay. A small HTTP layer keeps the route path close to the service contracts. |
| PostgreSQL plus TimescaleDB | Separate relational and time-series stores | Meter readings, tenants, sites, assets, tariffs, jobs, and feedback need one tenant-scoped boundary. |
| DuckDB for backtests | A deployed analytics database | Replay is a local historical scan, and the build needed fast evidence without adding another server to self-host. |
| Postgres job rows with advisory locks | Mandatory Redis queue | Forecast, replay, outbox, and expiry jobs already share the same database state. Jobs can be claimed and retried without another required service. |
| Feature-flagged integrations | Required Notification Hub, Workflow Engine, or IoT services | CSV, JSON, and local forecasts must run the product alone. Integrations add freshness, alerts, and approval flows, but never own core planning. |
| Failed stale forecasts | Ready no-action fallback | Invalid input must not masquerade as a valid plan. The operator needs a failed plan reason, not a quiet do-nothing answer. |
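The advisory-lock decision in the table can be illustrated with a small claim loop. This is a hedged Python sketch, not the engine's Nim job runner. `pg_try_advisory_xact_lock` is a real PostgreSQL function that takes a 64-bit key, but the key derivation, the function names, and the `try_lock` callable are assumptions. In a real deployment `try_lock` would presumably run `TRY_LOCK_SQL` inside the claiming transaction.

```python
import hashlib
from typing import Callable, Iterable, Optional

# Real PostgreSQL function; the lock is released when the transaction ends.
TRY_LOCK_SQL = "SELECT pg_try_advisory_xact_lock(%s)"


def advisory_key(tenant_id: str, job_id: str) -> int:
    """Derive a stable signed 64-bit lock key (hypothetical scheme)."""
    digest = hashlib.sha256(f"{tenant_id}:{job_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def claim_first(
    job_ids: Iterable[str],
    tenant_id: str,
    try_lock: Callable[[int], bool],
) -> Optional[str]:
    """Return the first queued job whose advisory lock could be taken."""
    for job_id in job_ids:
        # Trying the lock one candidate at a time avoids locking rows
        # that a set-based query would evaluate but never return.
        if try_lock(advisory_key(tenant_id, job_id)):
            return job_id
    return None
```

A worker that fails to take the lock simply moves to the next candidate, so two workers never run the same job and no separate queue service is needed.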
Ecosystem Integration
Sensor freshness can come from the sensor telemetry engine. Alerts and high-risk approval paths stay optional routing layers rather than planning dependencies. The planner stores local events and retries optional delivery from the outbox, so a failed alert never rolls back an accepted plan. The system runs standalone with no ecosystem services available.
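The outbox behaviour described above can be sketched with an in-memory SQLite database standing in for PostgreSQL. The table names, column names, event name, and sender functions are hypothetical. The sketch only shows the pattern: the decision and its outbox row commit together, and delivery happens later in a separate step.

```python
import sqlite3
from typing import Callable


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE decisions (plan_id TEXT PRIMARY KEY, verdict TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            plan_id TEXT NOT NULL,
            event TEXT NOT NULL,
            attempts INTEGER NOT NULL DEFAULT 0,
            delivered INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def accept_plan(conn: sqlite3.Connection, plan_id: str) -> None:
    # One transaction: the decision and its pending notification commit together.
    with conn:
        conn.execute(
            "INSERT INTO decisions (plan_id, verdict) VALUES (?, 'accept')", (plan_id,)
        )
        conn.execute(
            "INSERT INTO outbox (plan_id, event) VALUES (?, 'plan.accepted')", (plan_id,)
        )


def dispatch_outbox(conn: sqlite3.Connection, send: Callable[[str, str], None]) -> int:
    """Try to deliver pending events; failures are counted and left for retry."""
    rows = conn.execute(
        "SELECT id, plan_id, event FROM outbox WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, plan_id, event in rows:
        try:
            send(plan_id, event)
        except Exception:
            with conn:
                conn.execute(
                    "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?", (row_id,)
                )
            continue
        with conn:
            conn.execute(
                "UPDATE outbox SET delivered = 1, attempts = attempts + 1 WHERE id = ?",
                (row_id,),
            )
        delivered += 1
    return delivered


def unreachable_hub(plan_id: str, event: str) -> None:
    """Hypothetical sender that simulates an offline notification service."""
    raise ConnectionError("notification hub offline")


def working_hub(plan_id: str, event: str) -> None:
    """Hypothetical sender that always succeeds."""
    return None
```

Calling `dispatch_outbox(conn, unreachable_hub)` after `accept_plan` leaves the decision row in place and the event pending, which is the property the paragraph describes.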
Results
The shipped evidence is concrete. The import path parses 35,000 deterministic interval rows inside the 60-second target. A 24-hour plan contract covers 96 intervals, which is one interval every 15 minutes, with confidence band, binding constraints, savings, risks, selected actions, and rejected actions. The replay fixture compares planner, no-action, and threshold policies over a 12-month evidence path.
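The 96-interval figure follows from dividing a 24-hour window into 15-minute steps. The sketch below shows that arithmetic. The 15-minute resolution is inferred from the numbers in the text, and the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import List

INTERVAL = timedelta(minutes=15)  # inferred: 24 hours / 96 intervals
WINDOW = timedelta(hours=24)


def plan_intervals(window_start: datetime) -> List[datetime]:
    """Return the start time of every interval in one plan window."""
    count = WINDOW // INTERVAL  # timedelta // timedelta yields an int
    return [window_start + i * INTERVAL for i in range(count)]


starts = plan_intervals(datetime(2024, 7, 1, 0, 0, tzinfo=timezone.utc))
print(len(starts))  # 96
```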
Operationally, the system moved from route shells and static claims to Docker-built Nim execution, 18 passing Playwright tests, a clean secret scan, optional integration unit coverage, and final success-criteria evidence under .pi/final-performance-audit/latest.json. The visible product now gives operators a readable plan, a reason for every rejected action, replay exports, and audit history for human decisions.
The next scaling change is not a new UI. It is replay scheduling. Many parallel 12-month backtests would need worker concurrency controls and progress tracking. The current design is right for one operator or a small multi-tenant self-hosted deployment because it proves the planning contract before expanding into portfolio optimization.
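As a rough picture of what that replay scheduling could involve, the sketch below caps the number of concurrent backtests and reports progress as each one finishes. It is a speculative Python outline, not planned product code. The function name, the callback shape, and the use of a thread pool are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional, TypeVar

T = TypeVar("T")


def run_replays(
    site_ids: Iterable[str],
    replay: Callable[[str], T],
    max_workers: int = 2,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, T]:
    """Run one replay per site with at most max_workers running at once."""
    sites = list(site_ids)
    results: Dict[str, T] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(replay, site): site for site in sites}
        # as_completed yields futures as they finish, which drives progress.
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress is not None:
                on_progress(done, len(sites))
    return results
```

The worker cap keeps many parallel 12-month scans from competing for the same machine, and the callback gives an operator a completed-of-total count without waiting for the whole batch.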