Energy – AI Energy Solutions

LSTM vs Transformer: Next-Day Energy Forecasting on Smart Grid

Fictional

03/01/2026 > 21/01/2026

Context

I was commissioned by AI Energy Solutions (fictional), a smart grid company managing 50,000 households in Île-de-France, to develop a deep learning forecasting system for daily energy purchasing on the EPEX spot market.

My role:

Compare LSTM and Transformer architectures for 24-hour consumption prediction
Engineer features combining historical patterns, weather data, and calendar events
Achieve forecasting accuracy below 0.5 kW MAE to reduce annual losses (€62.5M from over/under-provisioning)

Datasets

UCI Individual Household Electric Power Consumption: real data from a household in Sceaux (Paris suburb), December 2006 to November 2010

2,049,280 minute-level observations → 34,589 hourly after aggregation
7 variables: global active/reactive power, voltage, intensity, 3 sub-meters
Target: Global Active Power (kW)

External weather data: Open-Meteo API (Sceaux) → Heating Degree, Humidity, Cold Humidity Index

Workflow

Environment: Google Colab (Python 3), TensorFlow/Keras, pandas, scikit-learn, GPU T4

Data preparation:

Hourly resampling, 1.25% missing values removed
Temporal split: Train 73% (2006-2009) / Val 15% / Test 12% (2010)
Sliding window: 336h input (2 weeks) → 24h output
MinMaxScaler normalization (fit on train only)

Feature engineering (34 features):

Lag features: 1h, 2h, 3h, 6h, 12h, 24h, 48h, 168h, 336h
Rolling stats: mean (6h, 12h, 24h, 168h, 336h), std/min/max (24h)
Cyclical encoding: hour, day of week, month (sin/cos)
Calendar: is_weekend, is_holiday, is_vacation
Weather: heating_degree (base 20°C), humidity, cold_humidity_index

Model development:

LSTM: 2 layers (64, 32 units), dropout 0.33, early stopping
Transformer: 2 encoder layers, 1 positional encoding, 4 heads, d_model=64, L2 reg, dropout 0.35

Insights

Model performance:

LSTM: MAE 0.4145 kW, overfitting gap 1.63%, training 1.8 min
Transformer: MAE 0.4086 kW (+1.4%), overfitting gap 6.62%, training 7.3 min
Both achieved target MAE < 0.5 kW

Feature importance: lag features (lag_1h: 0.71 corr) and sub-meters dominate; weather features show weak correlation (~0.15-0.20)
Limitations: both models underestimate peaks >2.5 kW; single household limits generalization.

Business Impact

LSTM recommended for production: 4x lower overfitting, simpler, faster retraining
Error reduction: 45% (MAE 0.75 → 0.4145 kW)
Estimated annual savings: ~€28M
Deployment roadmap: pilot on 5,000 households for 3 months, weekly retraining, MAE monitoring dashboard, hyperparameter optimization
Scientific contribution: demonstrated LSTM robustness over Transformer for residential energy forecasting, on simpler datasets.