Forecast Reconciliation:
A Review

Anastasios Panagiotelis

16th August, 2023

Joint work with...

  • Rob Hyndman
  • George Athanasopoulos
  • Nikos Kourentzes
  • Puwasala Gamakumara
  • Mohamaed Affan
  • Han Li
  • Hong Li
  • Yang Lu
  • Florian Eckert
  • Fotios Petropoulos
  • Jooyoung Jeon
  • Bohan Zhang
  • Yanfei Kang
  • Feng Li

Motivation

Hierarchical Time Series

  • Predictions of multiple variables needed.
  • Variables follow linear constraints.
  • Example: forecast store-level sales and their aggregates.

Temporal

Grouped Time Series

Examples

  • Tourism data grouped by:
    • Region
    • Purpose of travel
  • Prison population data grouped by:
    • State
    • Gender

Definition of Hierarchies

  • Alternative definitions of 'Hierarchical Time Series':
    • Any collection of $n$ variables subject to $q$ linear constraints.
    • Any collection of $n$ variables with support on an $m$-dimensional linear subspace, where $n = m + q$.
  • These definitions are equivalent. Notably:
    • They do not need to involve aggregation.
    • They do not even need to be time series.
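
The equivalence of the two definitions can be checked numerically. The sketch below assumes a three-variable hierarchy T = A + B, which is an illustrative choice; the names `C` (a q x n constraint matrix) and `S` (an n x m matrix whose columns span the subspace) are likewise assumptions made for this example.

```python
import numpy as np

# Hypothetical example: n = 3 variables (T, A, B) with q = 1 constraint T - A - B = 0.
C = np.array([[1.0, -1.0, -1.0]])   # q x n constraint matrix
S = np.array([[1.0, 1.0],           # T = A + B
              [1.0, 0.0],           # A
              [0.0, 1.0]])          # B  (n x m, columns span the subspace)

n, m = S.shape
q = C.shape[0]

def is_coherent(y, tol=1e-10):
    """True if y satisfies every linear constraint, i.e. C y = 0."""
    return bool(np.all(np.abs(C @ y) < tol))

# Any point built from m free values lies in the subspace and meets the constraints.
y = S @ np.array([3.0, 4.0])        # (7, 3, 4)
```

Here `C @ S` is a zero matrix, so every vector in the column space of `S` satisfies the constraint, and the dimensions satisfy n = m + q.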

Electricity Example

  • Total daily electricity in Australian NEM
    • Renewable
    • Non-renewable
  • Renewable can be broken down
    • Solar
    • Wind
    • etc.
  • Solar can be broken down into
    • Solar rooftop
    • Solar utility
  • Data sourced from Open NEM.

Electricity Data (link)

Main takeaways

  • Data have different characteristics regarding
    • Trends
    • Seasonality
    • Spikes
    • Signal to noise ratio
  • Hard to come up with a single multivariate model.
  • Even harder to do so while accounting for constraints.

What is reconciliation?

Traditional approaches

  • Single level approaches
    • Bottom Up (Schwarzkopf, Tersine, and Morris, 1988).
    • Top Down (Gross and Sohl, 1990).
    • Middle Out
  • Top-down approaches do not exploit information at the bottom levels.
  • Bottom-up approaches can suffer from the noisiness of bottom-level series.
  • These approaches do not work for more general constraints.
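
A minimal sketch of the two main single-level approaches, assuming a two-level hierarchy Total = A + B. All numbers, the function names `bottom_up` and `top_down`, and the fixed proportions are made up for illustration.

```python
# Hypothetical base forecasts; they are incoherent because 45 + 50 != 100.
base = {"Total": 100.0, "A": 45.0, "B": 50.0}

def bottom_up(base):
    """Keep the bottom-level forecasts and sum them to get the total."""
    return {"Total": base["A"] + base["B"], "A": base["A"], "B": base["B"]}

def top_down(base, proportions):
    """Keep the total forecast and split it using fixed proportions."""
    return {"Total": base["Total"],
            "A": base["Total"] * proportions["A"],
            "B": base["Total"] * proportions["B"]}

bu = bottom_up(base)                         # ignores the forecast of Total
td = top_down(base, {"A": 0.4, "B": 0.6})    # ignores the forecasts of A and B
```

Both outputs are coherent, but each discards the base forecasts made at the other level.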

What is reconciliation?

  • Traditional approaches only use forecasts at a single level.
  • Question: Why not produce forecasts for all series?
    • Answer: They may not respect constraints
  • Solution: Adjust forecasts ex post.
  • This is called forecast reconciliation.

In the beginning

  • Some very early examples involving national accounts data
    • Stone, Champernowne, and Meade (1942)
    • Byron (1978)
  • Literature becomes more focused on forecasting with Hyndman, Ahmed, Athanasopoulos, and Shang (2011)
  • How did they think about the problem?

The regression interpretation

  • Consider the $n$-vector of initial (so-called base) forecasts, denoted $\hat{y}$.
  • Let $\beta$ be an $m$-vector of 'true' bottom-level forecasts.
  • Consider the regression

$$\hat{y} = S\beta + \epsilon$$

  • What is $S$?

The summing matrix

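$$S = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$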
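
As a sketch of what the summing matrix does, the snippet below builds the same 7 x 4 matrix and maps bottom-level values to the full hierarchy. The series labels (Total, A, B, AA, AB, BA, BB) and the numbers in `b` are assumptions for illustration; the slide gives only the matrix.

```python
import numpy as np

# Rows: Total, A, B, AA, AB, BA, BB; columns: bottom-level series AA, AB, BA, BB.
S = np.vstack([
    np.ones((1, 4)),                       # Total = AA + AB + BA + BB
    np.kron(np.eye(2), np.ones((1, 2))),   # A = AA + AB, B = BA + BB
    np.eye(4),                             # bottom-level series map to themselves
])

b = np.array([1.0, 2.0, 3.0, 4.0])         # made-up bottom-level values
y = S @ b                                  # full coherent vector of all 7 series
```

Premultiplying any bottom-level vector by `S` yields a vector that satisfies all aggregation constraints.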

The solution

  • Can get an 'estimate' of $\beta$ via least squares

$$\hat{\beta} = (S'S)^{-1}S'\hat{y}$$

  • Find a full set of coherent forecasts as

$$\tilde{y} = S(S'S)^{-1}S'\hat{y}$$

  • Here $\tilde{y}$ is the OLS reconciled forecast.
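
The two formulas above can be sketched in a few lines of NumPy. The base forecasts in `y_hat` are made up, and the function name `ols_reconcile` is hypothetical; the normal equations are solved directly rather than forming the inverse explicitly.

```python
import numpy as np

S = np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

# Made-up base forecasts (Total, A, B, AA, AB, BA, BB); they do not add up.
y_hat = np.array([105.0, 48.0, 52.0, 20.0, 30.0, 25.0, 28.0])

def ols_reconcile(S, y_hat):
    """Return (beta_hat, y_tilde) for OLS reconciliation."""
    # Solve (S'S) beta = S' y_hat instead of computing an explicit inverse.
    beta_hat = np.linalg.solve(S.T @ S, S.T @ y_hat)
    return beta_hat, S @ beta_hat

beta_hat, y_tilde = ols_reconcile(S, y_hat)
```

The reconciled vector `y_tilde` satisfies every aggregation constraint, and a base forecast that is already coherent is left unchanged.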

Other reconciliation methods

  • Other reconciliation methods take the form

$$\tilde{y} = S(S'WS)^{-1}S'W\hat{y}$$

  • Different choices of the weight matrix $W$
    • Diagonal (Athanasopoulos, Hyndman, Kourentzes, and Petropoulos, 2017)
    • Inverse of the forecast error covariance, or 'MinT' (Wickramasuriya, Athanasopoulos, and Hyndman, 2019)
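
A sketch of the general formula with a diagonal weight matrix, assuming the small hierarchy T = A + B. The base forecasts and the error variances are made up, and the function name `reconcile` is hypothetical.

```python
import numpy as np

def reconcile(S, y_hat, W):
    """Compute S (S'WS)^{-1} S'W y_hat for a symmetric positive definite W."""
    G = np.linalg.solve(S.T @ W @ S, S.T @ W)   # m x n map to the bottom level
    return S @ (G @ y_hat)

S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])   # hypothetical T = A + B
y_hat = np.array([100.0, 45.0, 50.0])                # made-up base forecasts

# Diagonal W: inverse of assumed forecast error variances for T, A, B.
variances = np.array([4.0, 1.0, 1.0])
y_wls = reconcile(S, y_hat, np.diag(1.0 / variances))
y_ols = reconcile(S, y_hat, np.eye(3))               # W = I recovers OLS
```

In this example the total has the largest assumed variance, so the weighted version adjusts the total forecast more than OLS does and moves A and B less.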

Geometric Interpretation

  • Base forecasts $\hat{y}$ lie somewhere in $\mathbb{R}^n$.
  • The realisation $y$ lies on a linear subspace $\mathfrak{s}$ that is spanned by the columns of $S$.
  • Forecasts need to be reconciled via a mapping $\tilde{y} = \psi(\hat{y})$.

Why does it work?

  • Consider a loss function

$$L_W(y, \breve{y}) = (y - \breve{y})'W(y - \breve{y})$$

  • Can prove that for $\tilde{y} = S(S'WS)^{-1}S'W\hat{y}$

$$L_W(y, \tilde{y}) \le L_W(y, \hat{y})$$

  • Proof in Panagiotelis, Athanasopoulos, Gamakumara, and Hyndman (2021)
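
The inequality can be checked numerically. The sketch below is a simulation under assumptions, not a proof: it uses the hierarchy T = A + B, a randomly generated positive definite `W`, and random coherent realisations and base forecasts.

```python
import numpy as np

rng = np.random.default_rng(0)
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])   # hypothetical T = A + B

def loss(y, y_check, W):
    """Quadratic loss (y - y_check)' W (y - y_check)."""
    e = y - y_check
    return float(e @ W @ e)

def reconcile(S, y_hat, W):
    """Projection S (S'WS)^{-1} S'W y_hat."""
    return S @ np.linalg.solve(S.T @ W @ S, S.T @ W @ y_hat)

# Random symmetric positive definite W (an arbitrary choice for the check).
A = rng.normal(size=(3, 3))
W = A @ A.T + 3.0 * np.eye(3)

never_worse = True
for _ in range(1000):
    y = S @ rng.normal(size=2)       # coherent realisation
    y_hat = rng.normal(size=3)       # arbitrary, generally incoherent base forecast
    y_tilde = reconcile(S, y_hat, W)
    if loss(y, y_tilde, W) > loss(y, y_hat, W) + 1e-9:
        never_worse = False
```

The flag `never_worse` stays true because the reconciled forecast is the projection of the base forecast onto the coherent subspace in the geometry induced by `W`, and the realisation lies on that subspace.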

Intuition

Optimality

  • Assume unbiased forecasts.
  • Can prove that for any $W$, $E[L_W(y, \tilde{y})]$ is minimised by

$$\tilde{y} = S(S'\Sigma^{-1}S)^{-1}S'\Sigma^{-1}\hat{y}$$

where $\Sigma$ is the forecast error covariance,

$$\Sigma = E[(y - \hat{y})(y - \hat{y})']$$
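
In practice $\Sigma$ is unknown and has to be estimated. The sketch below plugs a plain sample covariance of simulated errors into the formula; both the simulated errors and the choice of estimator are assumptions made only for illustration, and the function name `mint_matrix` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])   # hypothetical T = A + B

# Simulated in-sample base forecast errors (rows = time, columns = T, A, B).
# With real data these would be residuals from the fitted base models.
errors = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 2.0])
Sigma_hat = np.cov(errors, rowvar=False)             # 3 x 3 sample covariance

def mint_matrix(S, Sigma):
    """Return S (S' Sigma^{-1} S)^{-1} S' Sigma^{-1}."""
    Sigma_inv_S = np.linalg.solve(Sigma, S)          # Sigma^{-1} S
    # Sigma is symmetric, so (Sigma^{-1} S)' equals S' Sigma^{-1}.
    G = np.linalg.solve(S.T @ Sigma_inv_S, Sigma_inv_S.T)
    return S @ G

P = mint_matrix(S, Sigma_hat)
y_tilde = P @ np.array([100.0, 45.0, 50.0])          # made-up base forecasts
```

The matrix `P` is idempotent and leaves coherent vectors unchanged, so it is a projection onto the coherent subspace.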

Intuition

What theory does not tell us (yet)

  • How to reconcile if we are primarily interested in a single series (or subset of series).
  • How to obtain improvements in forecast accuracy for all series.
  • How to improve forecast accuracy for losses that are not quadratic.
  • Nonetheless, these properties often (but not always) hold empirically.

Model averaging

  • Consider the simplest hierarchy, $T = A + B$.
  • There are two forecasts for $A$:
    • Direct: $\hat{A}$
    • Indirect: $\hat{T} - \hat{B}$
  • Reconciliation is a model average of the direct and indirect forecasts (Hollyman, Petropoulos, and Tipping, 2021).
  • Model averaging is a relatively well understood problem.
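
For OLS reconciliation the averaging weights can be worked out by hand. The derivation below is a sketch that assumes the ordering $(T, A, B)$ and $W = I$; other choices of $W$ give different weights.

```latex
S = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
(S'S)^{-1} = \frac{1}{3}\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \qquad
S'\hat{y} = \begin{pmatrix} \hat{T} + \hat{A} \\ \hat{T} + \hat{B} \end{pmatrix}

\tilde{A} = \tfrac{1}{3}\left[ 2(\hat{T} + \hat{A}) - (\hat{T} + \hat{B}) \right]
          = \tfrac{2}{3}\,\hat{A} + \tfrac{1}{3}\,(\hat{T} - \hat{B})
```

Under these assumptions the reconciled forecast of $A$ puts weight 2/3 on the direct forecast and 1/3 on the indirect one.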

Other interesting problems

  • Discrete reconciliation
    • Zambon, Azzimonti, and Corani (2022)
    • Zhang, Panagiotelis, Li, and Kang (2023)
  • Cross Temporal Reconciliation
    • Di Fonzo and Girolimetto (2023)
  • Machine Learning
    • Spiliotis, Abolghasemi, Hyndman, Petropoulos, and Assimakopoulos (2021)
    • Burba and Chen (2021)

Probabilistic forecasts

Early attempts

  • Reconcile the means, but otherwise proceed bottom up (Ben Taieb, Taylor, and Hyndman, 2021).
  • Reconcile quantiles (Jeon, Panagiotelis, and Petropoulos, 2019).
    • This is valid only when the forecasts are perfectly dependent.
  • Can the notions of coherence and reconciliation be extended to the probabilistic setting in a formal way?
  • See Panagiotelis, Gamakumara, Athanasopoulos, and Hyndman (2023).

Formal Definition: Coherence

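  • Let $(\mathbb{R}^m, \mathcal{F}_{\mathbb{R}^m}, \mu)$ be a probability triple.
  • Let $s: \mathbb{R}^m \to \mathfrak{s}$, where $s(\cdot)$ is premultiplication by the matrix $S$.
  • A coherent probabilistic forecast is characterised by the probability triple $(\mathfrak{s}, \mathcal{F}_{\mathfrak{s}}, \nu)$ where

$$\nu(s(\mathcal{B})) = \mu(\mathcal{B}) \quad \forall \mathcal{B} \in \mathcal{F}_{\mathbb{R}^m}$$

and $s(\mathcal{B})$ is the image of $\mathcal{B}$ under $s(\cdot)$.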

In a picture

Formal Definition: Reconciliation

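Let $(\mathbb{R}^n, \mathcal{F}_{\mathbb{R}^n}, \hat{\nu})$ be a probability triple corresponding to a base forecast.
The reconciled forecast is characterised by

$$\tilde{\nu}(\mathcal{A}) = \hat{\nu}(\psi^{-1}(\mathcal{A})) \quad \forall \mathcal{A} \in \mathcal{F}_{\mathfrak{s}}$$

where $\psi^{-1}(\mathcal{A})$ is the pre-image of $\mathcal{A}$ under $\psi(\cdot)$.

  • The measure $\tilde{\nu}$ is the pushforward of $\hat{\nu}$ under $\psi$.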

In a picture

In practice

If $\hat{y}^{[1]}, \ldots, \hat{y}^{[L]}$ is a sample from some base probabilistic forecast, then $\tilde{y}^{[1]}, \ldots, \tilde{y}^{[L]}$ is a sample from the reconciled forecast, where

$$\tilde{y}^{[l]} = \psi(\hat{y}^{[l]}), \quad l = 1, \ldots, L$$

Reconciling a sample from the base distribution gives a sample from the reconciled distribution.
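
A sketch of this recipe, assuming the hierarchy T = A + B, a made-up Gaussian base forecast with independent margins, and the OLS projection as the mapping `psi`. All three are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])   # hypothetical T = A + B
P = S @ np.linalg.solve(S.T @ S, S.T)                # OLS projection matrix

def psi(y_hat):
    """Reconciliation mapping; the OLS projection is an assumption for this sketch."""
    return P @ y_hat

# L draws from a made-up base forecast for (T, A, B).
L = 500
base_draws = rng.normal(loc=[100.0, 45.0, 50.0], scale=[3.0, 1.0, 2.0], size=(L, 3))

# Apply psi draw by draw to obtain a sample from the reconciled forecast.
reconciled_draws = np.array([psi(draw) for draw in base_draws])
```

Every row of `reconciled_draws` is coherent, while the rows of `base_draws` generally are not.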

Some results

  • For elliptical distributions, linear reconciliation leads to another elliptical distribution.
    • The true predictive distribution can be recovered by linear reconciliation.
    • This need not be a projection.
  • In the Gaussian case, Wickramasuriya (2023) proves that MinT is optimal with respect to the log score.
  • Otherwise, resort to numerical methods.
  • The reconciliation mapping $\psi$ can be found by optimising with respect to a scoring rule.
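
To make the objective concrete, the sketch below estimates the energy score from a sample. The slides do not name a particular scoring rule, so the energy score is an assumed choice here, and the function name `energy_score` and all numbers are illustrative. An optimiser would minimise such a score, averaged over a training window, with respect to the parameters of $\psi$.

```python
import numpy as np

def energy_score(draws, y):
    """Sample estimate of E||Y - y|| - 0.5 E||Y - Y'|| (lower is better)."""
    draws = np.asarray(draws, dtype=float)
    term1 = np.mean(np.linalg.norm(draws - y, axis=1))
    # Pairwise distances between all draws (zero self-distances included).
    diffs = draws[:, None, :] - draws[None, :, :]
    term2 = np.mean(np.linalg.norm(diffs, axis=2))
    return float(term1 - 0.5 * term2)

rng = np.random.default_rng(3)
y = np.array([95.0, 45.0, 50.0])          # made-up coherent realisation
good = y + rng.normal(size=(300, 3))      # draws centred on the realisation
bad = good + 10.0                         # same spread, shifted away from it
```

The sample centred on the realisation receives a lower (better) score than the shifted sample, and a point mass at the realisation scores zero.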

An alternative

  • Zambon, Azzimonti, and Corani (2022) propose an alternative approach.
  • Simply consider the base forecast distribution, conditional on coherence being met.
  • Sampling techniques (importance sampling, MCMC) can be used to draw from the resulting posterior.
  • Research into the theoretical properties (of both approaches) is ongoing.
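
For small count-valued problems the conditioning step can be carried out exactly. The sketch below assumes the hierarchy T = A + B with independent, made-up Poisson base forecasts, and uses exact enumeration in place of the sampling schemes mentioned above. It is an illustration of conditioning on coherence, not an implementation of any published algorithm.

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(lam, k_max):
    """Poisson probabilities for 0..k_max, renormalised after truncation."""
    p = np.array([exp(-lam) * lam**k / factorial(k) for k in range(k_max + 1)])
    return p / p.sum()

K = 30                                # truncation point for bottom-level counts
p_A = poisson_pmf(3.0, K)             # made-up base forecast for A
p_B = poisson_pmf(4.0, K)             # made-up base forecast for B
p_T = poisson_pmf(9.0, 2 * K)         # made-up base forecast for T (mean is not 3 + 4)

# Joint over (a, b) conditional on T = A + B: p_A(a) p_B(b) p_T(a + b), renormalised.
a = np.arange(K + 1)[:, None]
b = np.arange(K + 1)[None, :]
joint = p_A[:, None] * p_B[None, :] * p_T[a + b]
joint /= joint.sum()

mean_A = float((joint.sum(axis=1) * np.arange(K + 1)).sum())
mean_B = float((joint.sum(axis=0) * np.arange(K + 1)).sum())
mean_T = mean_A + mean_B              # coherent by construction
```

The conditional mean of the total falls between the bottom-up value of 7 and the base forecast mean of 9, which shows how conditioning blends information from both levels.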

Another alternative

  • Rather than a two-step approach, consider an end-to-end approach (Rangapuram, Werner, Benidis, Mercado, Gasthaus, and Januschowski, 2021).
  • Neural networks that include
    • A layer that guarantees coherence
    • Scoring rule as an objective function
  • Not always applicable in organisational settings.

Application areas

  • Macroeconomics
    • Components of GDP
  • Retail demand
    • Amazon, Walmart
  • Mortality
    • Aggregate by geography or cause of death
  • Healthcare
    • Accidents and Emergencies
  • And others

Summary

  • Forecast reconciliation is an interesting area
  • Despite progress, important questions remain unanswered
    • Theoretically
    • Methodologically
    • Empirically
  • So jump on the bandwagon!

References

Athanasopoulos, G. et al. (2017). "Forecasting with temporal hierarchies". In: European Journal of Operational Research 262.1, pp. 60-74.

Ben Taieb, S. et al. (2021). "Hierarchical Probabilistic Forecasting of Electricity Demand With Smart Meter Data". In: Journal of the American Statistical Association 116, pp. 27-43.

Burba, D. et al. (2021). "A trainable reconciliation method for hierarchical time-series". URL: https://arxiv.org/abs/2101.01329.

Byron, R. P. (1978). "The estimation of large social account matrices". In: Journal of the Royal Statistical Society, Series A 141.3, pp. 359-367.

Di Fonzo, T. et al. (2023). "Cross-temporal forecast reconciliation: Optimal combination method and heuristic alternatives". In: International Journal of Forecasting 39.1, pp. 39-57.

Gross, C. W. et al. (1990). "Disaggregation methods to expedite product line forecasting". In: Journal of Forecasting 9.3, pp. 233-254.

Hollyman, R. et al. (2021). "Understanding forecast reconciliation". In: European Journal of Operational Research 294.1, pp. 149-160. DOI: 10.1016/j.ejor.2021.01.017.

Hyndman, R. J. et al. (2011). "Optimal combination forecasts for hierarchical time series". In: Computational Statistics and Data Analysis 55.9, pp. 2579-2589.

Jeon, J. et al. (2019). "Probabilistic forecast reconciliation with applications to wind power and electric load". In: European Journal of Operational Research 279.2, pp. 364-379.

Panagiotelis, A. et al. (2021). "Forecast reconciliation: A geometric view with new insights on bias correction". In: International Journal of Forecasting 37.1, pp. 343-359.

Panagiotelis, A. et al. (2023). "Probabilistic forecast reconciliation: properties, evaluation and score optimisation". In: European Journal of Operational Research 306.2, pp. 693-706.

Rangapuram, S. S. et al. (2021). "End-to-end learning of coherent probabilistic forecasts for hierarchical time series". In: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, pp. 8832-8843.

Schwarzkopf, A. B. et al. (1988). "Top-down versus bottom-up forecasting strategies". In: International Journal of Production Research 26.11, pp. 1833-1843.

Spiliotis, E. et al. (2021). "Hierarchical forecast reconciliation with machine learning". In: Applied Soft Computing 112, p. 107756.

Stone, R. et al. (1942). "The precision of national income estimates". In: Review of Economic Studies 9.2, pp. 111-125. DOI: 10.2307/2967664.

Wickramasuriya, S. L. (2023). "Probabilistic forecast reconciliation under the Gaussian framework". In: Journal of Business & Economic Statistics, pp. 1-14.

Wickramasuriya, S. L. et al. (2019). "Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization". In: Journal of the American Statistical Association 114.526, pp. 804-819.

Zambon, L. et al. (2022). "Efficient probabilistic reconciliation of forecasts for real-valued and count time series". URL: https://arxiv.org/abs/2210.02286.

Zhang, B. et al. (2023). "Discrete forecast reconciliation". URL: https://arxiv.org/abs/2305.18809.
