Forecast Reconciliation: A Review
Anastasios Panagiotelis
16th August, 2023

Joint work with...
- Rob Hyndman
- George Athanasopoulos
- Nikos Kourentzes
- Puwasala Gamakumara
- Mohamaed Affan
- Han Li
- Hong Li
- Yang Lu
- Florian Eckert
- Fotios Petropoulos
- Jooyoung Jeon
- Bohan Zhang
- Yanfei Kang
- Feng Li

Motivation

Hierarchical Time Series

- Predictions of multiple variables needed.
- Variables follow linear constraints.
- Forecast store level sales and aggregates.

Temporal

Grouped Time Series

Examples

Tourism data grouped by:
- Region
- Purpose of travel

Prison population data grouped by:
- State
- Gender

Definition of Hierarchies

Alternative definitions of 'Hierarchical Time Series':
- Any collection of $n$ variables with $q$ linear constraints.
- Any collection of $n$ variables with support on an $m$-dimensional linear subspace, where $n = m + q$.

These are the same. Notably:
- They do not need to involve aggregation.
- They do not even need to be time series.
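A small numerical sketch of why the two definitions coincide, using a hypothetical three-variable example $T = A + B$ (so $n = 3$, $q = 1$, $m = 2$). The names `C` and `S` and the example itself are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical example: y = (T, A, B) with the single constraint T - A - B = 0.
C = np.array([[1.0, -1.0, -1.0]])    # q x n constraint matrix, C @ y = 0
n = C.shape[1]
q = np.linalg.matrix_rank(C)

# Columns of S span the subspace of vectors that satisfy the constraint.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])            # n x m
m = np.linalg.matrix_rank(S)

# Every column of S satisfies the constraint, and the dimensions add up.
constraints_hold = np.allclose(C @ S, 0.0)
print(constraints_hold, n == m + q)
```

Nothing here relies on `S` containing only zeros and ones: any full-rank matrix whose columns satisfy `C @ S = 0` describes the same subspace.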

Electricity Example

Total daily electricity in Australian NEM
- Renewable
- Non-renewable
Renewable can be broken down
- Solar
- Wind
- etc.
Solar can be broken down into
- Solar rooftop
- Solar utility
Data sourced from Open NEM.

Electricity Data (link)

Main takeaways

Data have different characteristics regarding
- Trends
- Seasonality
- Spikes
- Signal to noise ratio

Hard to come up with a single multivariate model.
Even harder to do so while accounting for constraints.

What is reconciliation?

Traditional approaches

Single level approaches
- Bottom Up (Schwarzkopf, Tersine, and Morris, 1988).
- Top Down (Gross and Sohl, 1990).
- Middle Out

Top down does not exploit information at bottom levels.
Bottom up can suffer from the noisiness of bottom level series.
These approaches do not work for more general constraints.
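The two main single level approaches can be sketched on a hypothetical hierarchy Total = A + B. The forecasts and the top down proportions below are made-up numbers for illustration; how the proportions are chosen in practice is not covered in the slides.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B, with y = (Total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up base forecasts for (Total, A, B); note that 100 != 55 + 40.
y_hat = np.array([100.0, 55.0, 40.0])

# Bottom up: keep only the bottom level forecasts and aggregate them.
bottom_up = S @ y_hat[1:]

# Top down: keep only the top level forecast and split it using
# assumed proportions (for example, historical shares).
proportions = np.array([0.6, 0.4])
top_down = S @ (proportions * y_hat[0])

print(bottom_up)
print(top_down)
```

Both results are coherent, but each one discards the forecasts made at the other level.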

What is reconciliation?

Traditional approaches only use forecasts at a single level.
Question: Why not produce forecasts for all series?
- Answer: They may not respect constraints.

Solution: Adjust forecasts ex post.
This is called forecast reconciliation.

In the beginning

Some very early examples involving national accounts data
- Stone, Champernowne, and Meade (1942)
- Byron (1978)

Literature becomes more focused on forecasting with Hyndman, Ahmed, Athanasopoulos, and Shang (2011).
How did they think about the problem?

The regression interpretation

Consider the $n$-vector of initial (so-called base) forecasts denoted $\hat{y}$.
Let $β$ be an $m$-vector of 'true' bottom level forecasts.
Consider a regression

$\hat{y} = S β + ϵ$

What is $S$ ?

The summing matrix

$S = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$
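The matrix above can be read as aggregation rows stacked on top of an identity block for the four bottom level series. A minimal sketch of that construction follows; the variable names and the bottom level values are illustrative assumptions.

```python
import numpy as np

# Aggregation rows: the total, then two middle level series.
aggregation = np.array([[1, 1, 1, 1],
                        [1, 1, 0, 0],
                        [0, 0, 1, 1]])

# Stack the aggregation rows on top of an identity for the bottom level.
S = np.vstack([aggregation, np.eye(4, dtype=int)])

# Any bottom level vector b is mapped to a full coherent vector y = S b.
b = np.array([1, 2, 3, 4])
y = S @ b

print(S.shape)
print(y)
```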

The solution

Can get an 'estimate' of $β$ via least squares

$\hat{β} = {(S^{'} S)}^{- 1} S^{'} \hat{y}$

Find a full set of coherent forecasts as

$\tilde{y} = S {(S^{'} S)}^{- 1} S^{'} \hat{y}$

Here $\tilde{y}$ is the OLS reconciled forecast.
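A minimal sketch of OLS reconciliation for the seven-series hierarchy, assuming made-up base forecasts. It solves the normal equations rather than forming the inverse explicitly, which is a numerical choice and not something prescribed by the slides.

```python
import numpy as np

# Summing matrix: total, two middle level series, four bottom level series.
S = np.vstack([
    np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float),
    np.eye(4),
])

# Made-up base forecasts; deliberately incoherent (e.g. 100 != 45 + 50).
y_hat = np.array([100.0, 45.0, 50.0, 20.0, 30.0, 25.0, 20.0])

# Least squares 'estimate' of the bottom level: solve (S'S) beta = S' y_hat.
beta_hat = np.linalg.solve(S.T @ S, S.T @ y_hat)

# Map back to the full hierarchy to obtain the OLS reconciled forecasts.
y_tilde = S @ beta_hat

print(np.round(y_tilde, 3))
```

Unlike bottom up or top down, every base forecast contributes to every reconciled forecast.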

Other reconciliation methods

Other reconciliation methods take the form

$\tilde{y} = S {(S^{'} W S)}^{- 1} S^{'} W \hat{y}$

Different choices of $W$
- Diagonal (Athanasopoulos, Hyndman, Kourentzes, and Petropoulos, 2017)
- Inverse of the error covariance, or 'MinT' (Wickramasuriya, Athanasopoulos, and Hyndman, 2019)
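The general form can be wrapped in a small helper. This is a sketch under assumptions: `reconcile` is a hypothetical function name, the variances are made up, and $W$ is treated as a weight matrix (here the inverse of a diagonal matrix of error variances), so noisier series receive less weight.

```python
import numpy as np

def reconcile(y_hat, S, W):
    """Return S (S'WS)^{-1} S'W y_hat for a symmetric positive definite W."""
    G = np.linalg.solve(S.T @ W @ S, S.T @ W)
    return S @ (G @ y_hat)

# Simple hierarchy Total = A + B, with y = (Total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
y_hat = np.array([100.0, 55.0, 40.0])    # made-up, incoherent base forecasts

# Hypothetical forecast error variances, e.g. from in-sample residuals.
variances = np.array([4.0, 1.0, 9.0])
W_diag = np.diag(1.0 / variances)

y_ols = reconcile(y_hat, S, np.eye(3))   # W = I recovers OLS
y_wls = reconcile(y_hat, S, W_diag)      # diagonal W

print(np.round(y_ols, 3))
print(np.round(y_wls, 3))
```

With the diagonal choice, the series with the smallest assumed variance (A) moves least from its base forecast.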

Geometric Interpretation

Base forecasts $\hat{y}$ lie somewhere in $\mathbb{R}^{n}$.
Realisation $y$ lies on a linear subspace $s$ that is spanned by the columns of $S$.
Forecasts need to be reconciled via a mapping $\tilde{y} = ψ (\hat{y})$.
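When $ψ$ is linear of the form $S (S' W S)^{-1} S' W$, it is a projection onto the span of $S$. The sketch below checks the defining properties numerically on a hypothetical three-series hierarchy; `projection` and the weights are illustrative assumptions.

```python
import numpy as np

S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

def projection(S, W):
    """Matrix of the map y_hat -> S (S'WS)^{-1} S'W y_hat."""
    return S @ np.linalg.solve(S.T @ W @ S, S.T @ W)

P_ols = projection(S, np.eye(3))
P_w = projection(S, np.diag([0.25, 1.0, 4.0]))   # arbitrary positive weights

# Applying the map twice changes nothing.
idempotent = np.allclose(P_ols @ P_ols, P_ols) and np.allclose(P_w @ P_w, P_w)
# Points already on the coherent subspace are left where they are.
fixes_coherent = np.allclose(P_ols @ S, S) and np.allclose(P_w @ S, S)
# OLS is the orthogonal projection; other W give oblique projections.
ols_is_orthogonal = np.allclose(P_ols, P_ols.T)
w_is_oblique = not np.allclose(P_w, P_w.T)

print(idempotent, fixes_coherent, ols_is_orthogonal, w_is_oblique)
```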

Why does it work?

Consider a loss function

$L_{W} (y, \breve{y}) = (y - \breve{y})^{'} W (y - \breve{y})$

Can prove that for $\tilde{y} = S {(S^{'} W S)}^{- 1} S^{'} W \hat{y}$

$L_{W} (y, \tilde{y}) \leq L_{W} (y, \hat{y})$

Proof in Panagiotelis, Athanasopoulos, Gamakumara, and Hyndman (2021)
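The inequality can be checked numerically. The sketch below assumes a symmetric positive definite $W$ and a coherent realisation $y$ (both needed for the result), and uses randomly drawn, made-up forecasts on a hypothetical three-series hierarchy.

```python
import numpy as np

rng = np.random.default_rng(0)

S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
W = np.diag([0.25, 1.0, 4.0])        # any symmetric positive definite W
P = S @ np.linalg.solve(S.T @ W @ S, S.T @ W)

def loss(y, forecast, W):
    """Quadratic loss (y - forecast)' W (y - forecast)."""
    e = y - forecast
    return e @ W @ e

never_worse = []
for _ in range(1000):
    y = S @ rng.normal(size=2)       # a coherent realisation
    y_hat = rng.normal(size=3)       # an arbitrary base forecast
    y_tilde = P @ y_hat              # reconciled with the same W
    # Small tolerance guards against floating point rounding only.
    never_worse.append(loss(y, y_tilde, W) <= loss(y, y_hat, W) + 1e-9)

print(all(never_worse))
```

The result holds draw by draw, not just on average, because the reconciled forecast is the closest point on the coherent subspace to the base forecast in the distance defined by $W$.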

Intuition

Optimality

Assume unbiased forecasts.
Can prove that for any $W$ , $E [L_{W} (y, \tilde{y})]$ is minimised by

$\tilde{y} = S {(S^{'} Σ^{- 1} S)}^{- 1} S^{'} Σ^{- 1} \hat{y}$ where $Σ$ is the forecast error covariance.

$Σ = E [(y - \hat{y}) (y - \hat{y})^{'}]$
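A sketch of the optimality claim on a hypothetical three-series hierarchy. For unbiased base forecasts and a projection matrix $P$, the reconciled error is $P (\hat{y} - y)$, so the expected loss is $\mathrm{tr}(W P Σ P')$. The covariance below is made up; in practice $Σ$ must be estimated, which the sketch does not address.

```python
import numpy as np

rng = np.random.default_rng(1)

S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# A made-up positive definite forecast error covariance.
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + np.eye(3)

def projection(S, W):
    """Matrix of the map y_hat -> S (S'WS)^{-1} S'W y_hat."""
    return S @ np.linalg.solve(S.T @ W @ S, S.T @ W)

def expected_loss(P, Sigma, W):
    """E[L_W(y, P y_hat)] = trace(W P Sigma P') for unbiased y_hat."""
    return np.trace(W @ P @ Sigma @ P.T)

P_ols = projection(S, np.eye(3))
P_mint = projection(S, np.linalg.inv(Sigma))

# The loss weighting is arbitrary; MinT should be no worse for any choice.
W_loss = np.diag([1.0, 2.0, 3.0])
loss_ols = expected_loss(P_ols, Sigma, W_loss)
loss_mint = expected_loss(P_mint, Sigma, W_loss)

print(loss_mint <= loss_ols)
```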

Intuition

What theory does not tell us (yet)

- How to reconcile if we are primarily interested in a single series (or subset of series).
- How to obtain improvements in forecast accuracy for all series.
- How to improve forecast accuracy for losses that are not quadratic.

Nonetheless, improvements of these kinds are often (but not always) observed empirically.

Model averaging

Consider the simplest hierarchy $T = A + B$.
There are two forecasts for $A$:
- Direct: $\hat{A}$
- Indirect: $\hat{T} - \hat{B}$

Reconciliation is a model average between the direct and indirect forecasts (Hollyman, Petropoulos, and Tipping, 2021).
Model averaging is a relatively well understood problem.
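For OLS reconciliation of this hierarchy the averaging weights can be worked out by hand: the reconciled forecast for $A$ puts weight 2/3 on the direct forecast and 1/3 on the indirect one. The sketch below checks this with made-up numbers; other choices of $W$ would give different weights.

```python
import numpy as np

# Hierarchy T = A + B, with y = (T, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up base forecasts.
T_hat, A_hat, B_hat = 100.0, 55.0, 40.0
y_hat = np.array([T_hat, A_hat, B_hat])

# OLS reconciliation.
y_tilde = S @ np.linalg.solve(S.T @ S, S.T @ y_hat)

# The same number obtained as a weighted average of two forecasts of A.
direct = A_hat
indirect = T_hat - B_hat
average = (2.0 / 3.0) * direct + (1.0 / 3.0) * indirect

print(y_tilde[1], average)
```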

Other interesting problems

Discrete reconciliation
- Zambon, Azzimonti, and Corani (2022)
- Zhang, Panagiotelis, Li, and Kang (2023)

Cross Temporal Reconciliation
- Di Fonzo and Girolimetto (2023)

Machine Learning
- Spiliotis, Abolghasemi, Hyndman, Petropoulos, and Assimakopoulos (2021)
- Burba and Chen (2021)

Probabilistic forecasts

Early attempts

Reconcile means but otherwise bottom up (Ben Taieb, Taylor, and Hyndman, 2021).
Reconcile quantiles (Jeon, Panagiotelis, and Petropoulos, 2019).
- This is only valid under perfectly dependent forecasts.

Can notions of coherence and reconciliation be extended to the probabilistic setting in a formal way?
See Panagiotelis, Gamakumara, Athanasopoulos, and Hyndman (2023).

Formal Definition: Coherence

Let $(R^{m}, F_{R^{m}}, μ)$ be a probability triple.
Let $s : R^{m} \to s$, where the map $s (.)$ is premultiplication by the matrix $S$ and its target $s$ is the coherent subspace.
A coherent probabilistic forecast is characterised by the probability triple $(s, F_{s}, ν)$ where

$ν (s (B)) = μ (B) \forall B \in F_{R^{m}}$ and $s (B)$ is the image of $B$ under $s (.)$ .

In a picture

Formal Definition: Reconciliation

Let $(R^{n}, F_{R^{n}}, \hat{ν})$ be a probability triple corresponding to a base forecast.
The reconciled forecast is characterised by

$\tilde{ν} (A) = \hat{ν} (ψ^{- 1} (A)) \forall A \in F_{s}$ where $ψ^{- 1} (A)$ is the pre-image of $A$ under $ψ (.)$ .

The measure $\tilde{ν}$ is the pushforward of $\hat{ν}$ under $ψ$.

In a picture

In practice

If ${\hat{y}}^{[1]}, \dots, {\hat{y}}^{[L]}$ is a sample from some base probabilistic forecast, then ${\tilde{y}}^{[1]}, \dots, {\tilde{y}}^{[L]}$ is a sample from the reconciled forecast where

${\tilde{y}}^{[l]} = ψ ({\hat{y}}^{[l]}) \forall l = 1, \dots, L$

Reconciling a sample from the base distribution gives a sample from the reconciled distribution.
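A sketch of the sample-based recipe on a hypothetical three-series hierarchy. The base forecast here is a set of independent Gaussians with made-up means and standard deviations, and $ψ$ is taken to be the OLS projection; both are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hierarchy Total = A + B, with y = (Total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# psi is assumed to be the OLS projection onto the span of S.
P = S @ np.linalg.solve(S.T @ S, S.T)

# Draw L samples from a made-up, incoherent base forecast distribution.
L = 1000
base_mean = np.array([100.0, 55.0, 40.0])
base_sd = np.array([5.0, 3.0, 4.0])
y_hat_samples = rng.normal(base_mean, base_sd, size=(L, 3))

# Apply psi to each draw (each row) to get draws from the reconciled forecast.
y_tilde_samples = y_hat_samples @ P.T

# Every reconciled draw satisfies Total = A + B.
coherent = np.allclose(y_tilde_samples[:, 0],
                       y_tilde_samples[:, 1] + y_tilde_samples[:, 2])
print(coherent)
```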

Some results

For elliptical distributions linear reconciliation leads to another elliptical distribution.
- The true predictive distribution can be recovered by linear reconciliation.
- This need not be a projection.

In the Gaussian case, Wickramasuriya (2023) proves that MinT is optimal with respect to the log score.
Otherwise, resort to numerical methods.
Reconciliation mapping $ψ$ can be found by optimising with respect to a scoring rule.
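In the Gaussian case the reconciled distribution is available in closed form: a linear map $P$ sends $N(\hat{μ}, \hat{Σ})$ to $N(P \hat{μ}, P \hat{Σ} P')$. The sketch below uses a MinT-style projection with a made-up base mean and covariance on a hypothetical three-series hierarchy.

```python
import numpy as np

# Hierarchy Total = A + B, with y = (Total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Made-up Gaussian base forecast: mean and positive definite covariance.
mu_hat = np.array([100.0, 55.0, 40.0])
Sigma_hat = np.array([[25.0, 5.0, 4.0],
                      [5.0, 9.0, 1.0],
                      [4.0, 1.0, 16.0]])

# MinT-style projection, using the inverse covariance as the weight matrix.
W = np.linalg.inv(Sigma_hat)
P = S @ np.linalg.solve(S.T @ W @ S, S.T @ W)

# A linear map of a Gaussian is Gaussian with these parameters.
mu_tilde = P @ mu_hat
Sigma_tilde = P @ Sigma_hat @ P.T

# Under the reconciled distribution, Total - A - B has mean and variance zero.
c = np.array([1.0, -1.0, -1.0])
print(np.isclose(c @ mu_tilde, 0.0), np.isclose(c @ Sigma_tilde @ c, 0.0))
```

The reconciled covariance is singular by construction, so all probability mass sits on the coherent subspace.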

An alternative

Zambon et al. (2022) propose an alternative approach.
Simply consider the base forecast, conditional on coherence being met.
Sampling techniques (importance sampling, MCMC) can be used to draw from the posterior.
Research into the theoretical properties (of both approaches) is ongoing.

Another alternative

Rather than a two-step approach, consider an end-to-end approach (Rangapuram, Werner, Benidis, Mercado, Gasthaus, and Januschowski, 2021).
Neural networks that include
- A layer that guarantees coherence
- Scoring rule as an objective function

Not always applicable in organisational settings.

Application areas

Macroeconomics
- Components of GDP

Retail demand
- Amazon, Walmart

Mortality
- Aggregate by geography or cause of death

Healthcare
- Accidents and Emergencies

And others

Summary

Forecast reconciliation is an interesting area.
Despite progress, important questions remain unanswered
- Theoretically
- Methodologically
- Empirically

So jump on the bandwagon!

References

Athanasopoulos, G. et al. (2017). "Forecasting with temporal hierarchies". In: European Journal of Operational Research 262.1, pp. 60-74.

Ben Taieb, S. et al. (2021). "Hierarchical Probabilistic Forecasting of Electricity Demand With Smart Meter Data". In: Journal of the American Statistical Association 116, pp. 27-43.

Burba, D. et al. (2021). "A trainable reconciliation method for hierarchical time-series". URL: https://arxiv.org/abs/2101.01329.

Byron, R. P. (1978). "The estimation of large social account matrices". In: Journal of the Royal Statistical Society, Series A 141.3, pp. 359-367.

Di Fonzo, T. et al. (2023). "Cross-temporal forecast reconciliation: Optimal combination method and heuristic alternatives". In: International Journal of Forecasting 39.1, pp. 39-57.

Gross, C. W. et al. (1990). "Disaggregation methods to expedite product line forecasting". In: Journal of Forecasting 9.3, pp. 233-254.

Hollyman, R. et al. (2021). "Understanding forecast reconciliation". In: European Journal of Operational Research 294.1, pp. 149-160. DOI: 10.1016/j.ejor.2021.01.017.

Hyndman, R. J. et al. (2011). "Optimal combination forecasts for hierarchical time series". In: Computational Statistics and Data Analysis 55.9, pp. 2579-2589.

Jeon, J. et al. (2019). "Probabilistic forecast reconciliation with applications to wind power and electric load". In: European Journal of Operational Research 279.2, pp. 364-379.

Panagiotelis, A. et al. (2021). "Forecast reconciliation: A geometric view with new insights on bias correction". In: International Journal of Forecasting 37.1, pp. 343-359.

Panagiotelis, A. et al. (2023). "Probabilistic forecast reconciliation: properties, evaluation and score optimisation". In: European Journal of Operational Research 306.2, pp. 693-706.

Rangapuram, S. S. et al. (2021). "End-to-end learning of coherent probabilistic forecasts for hierarchical time series". In: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, pp. 8832-8843.

Schwarzkopf, A. B. et al. (1988). "Top-down versus bottom-up forecasting strategies". In: International Journal of Production Research 26.11, pp. 1833-1843.

Spiliotis, E. et al. (2021). "Hierarchical forecast reconciliation with machine learning". In: Applied Soft Computing 112, p. 107756.

Stone, R. et al. (1942). "The precision of national income estimates". In: Review of Economic Studies 9.2, pp. 111-125. DOI: 10.2307/2967664.

Wickramasuriya, S. L. (2023). "Probabilistic forecast reconciliation under the Gaussian framework". In: Journal of Business & Economic Statistics, pp. 1-14.

Wickramasuriya, S. L. et al. (2019). "Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization". In: Journal of the American Statistical Association 114.526, pp. 804-819.

Zambon, L. et al. (2022). "Efficient probabilistic reconciliation of forecasts for real-valued and count time series". URL: https://arxiv.org/abs/2210.02286.

Zhang, B. et al. (2023). "Discrete forecast reconciliation". URL: https://arxiv.org/abs/2305.18809.
