Electricity price forecasting


Electricity price forecasting (EPF) is a branch of energy forecasting which focuses on predicting the spot and forward prices in wholesale electricity markets. Over the last 15 years, electricity price forecasts have become a fundamental input to energy companies’ decision-making at the corporate level.

Background

Since the early 1990s, the process of deregulation and the introduction of competitive electricity markets have been reshaping the landscape of the traditionally monopolistic and government-controlled power sectors. Throughout Europe, North America and Australia, electricity is now traded under market rules using spot and derivative contracts. However, electricity is a very special commodity: it is economically non-storable and power system stability requires a constant balance between production and consumption. At the same time, electricity demand depends on weather and the intensity of business and everyday activities. These unique characteristics lead to price dynamics not observed in any other market, exhibiting daily, weekly and often annual seasonality and abrupt, short-lived and generally unanticipated price spikes.
Extreme price volatility, which can be up to two orders of magnitude higher than that of any other commodity or financial asset, has forced market participants to hedge not only volume but also price risk. Price forecasts from a few hours to a few months ahead have become of particular interest to power portfolio managers. A power market company able to forecast the volatile wholesale prices with a reasonable level of accuracy can adjust its bidding strategy and its own production or consumption schedule in order to reduce the risk or maximize profits in day-ahead trading. A ballpark estimate of savings from a 1% reduction in the mean absolute percentage error (MAPE) of short-term price forecasts is $300,000 per year for a utility with a 1 GW peak load.

Taxonomy of modeling approaches

A variety of methods and ideas have been tried for EPF over the last 15 years, with varying degrees of success. They can be broadly classified into six groups: multi-agent, fundamental, reduced-form, statistical, computational intelligence and hybrid models.

Multi-agent models

Multi-agent models simulate the operation of a system of heterogeneous agents interacting with each other, and build the price process by matching the demand and supply in the market. This class includes cost-based models, equilibrium or game theoretic approaches and agent-based models.
Multi-agent models generally focus on qualitative issues rather than quantitative results. They may provide insights as to whether or not prices will be above marginal costs, and how this might influence the players’ outcomes. However, they pose problems if more quantitative conclusions have to be drawn, particularly if electricity prices have to be predicted with a high level of precision.

Fundamental models

Fundamental methods try to capture the basic physical and economic relationships which are present in the production and trading of electricity. The functional associations between fundamental drivers are postulated, and the fundamental inputs are modeled and predicted independently, often via statistical, reduced-form or computational intelligence techniques. In general, two subclasses of fundamental models can be identified: parameter-rich models and parsimonious structural models of supply and demand.
Two major challenges arise in the practical implementation of fundamental models: data availability and incorporation of stochastic fluctuations of the fundamental drivers. In building the model, we make specific assumptions about physical and economic relationships in the marketplace, and therefore the price projections generated by the models are very sensitive to violations of these assumptions.

Reduced-form models

Reduced-form models characterize the statistical properties of electricity prices over time, with the ultimate objective of derivatives valuation and risk management. Their main intention is not to provide accurate hourly price forecasts, but rather to replicate the main characteristics of daily electricity prices, like marginal distributions at future time points, price dynamics, and correlations between commodity prices. If the price process chosen is not appropriate for capturing the main properties of electricity prices, the results from the model are likely to be unreliable. However, if the model is too complex, the computational burden will prevent its use on-line in trading departments. Depending on the type of market under consideration, reduced-form models can be classified as jump-diffusion models or Markov regime-switching models.
Statistical models

Statistical methods forecast the current price by using a mathematical combination of previous prices and/or previous or current values of exogenous factors, typically consumption and production figures or weather variables. The two most important categories are additive and multiplicative models. They differ in whether the predicted price is the sum of a number of components or the product of a number of factors. The former are far more popular, but the two are closely related: a multiplicative model for prices can be transformed into an additive model for log-prices. Statistical models are attractive because some physical interpretation may be attached to their components, thus allowing engineers and system operators to understand their behavior. They are often criticized for their limited ability to model the nonlinear behavior of electricity prices and related fundamental variables. However, in practical applications, their performance is not worse than that of the non-linear computational intelligence methods. For instance, in the load forecasting track of the Global Energy Forecasting Competition, which attracted hundreds of participants worldwide, the top four winning entries used regression-type models.
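The relationship between the two categories can be illustrated with a short sketch on synthetic data (a hypothetical toy, not from the literature): a mean-reverting multiplicative price model becomes a simple additive AR(1) model for log-prices, whose coefficient can be estimated by ordinary least squares.

```python
# Hypothetical illustration: a multiplicative price model turns into an
# additive AR(1) model after taking logarithms of the prices.
import math
import random

random.seed(1)

true_phi = 0.8                      # mean-reversion strength (assumed)
mean_log = math.log(40.0)           # long-run log-price level (assumed)

# Simulate log-prices: log P_t = (1 - phi)*mu + phi*log P_{t-1} + noise.
logp = [mean_log]
for _ in range(500):
    logp.append((1 - true_phi) * mean_log
                + true_phi * logp[-1]
                + random.gauss(0.0, 0.1))

# Estimate phi by least squares on the demeaned series:
# phi_hat = sum(x_t * x_{t-1}) / sum(x_{t-1}^2).
m = sum(logp) / len(logp)
x = [v - m for v in logp]
phi_hat = (sum(x[t] * x[t - 1] for t in range(1, len(x)))
           / sum(x[t - 1] ** 2 for t in range(1, len(x))))
```

Fitting the additive model to log-prices and exponentiating the resulting forecast recovers a forecast for the price itself.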
[Figure: neural network architectures that are most popular in EPF. Input nodes are denoted by filled circles, output nodes by empty circles, and nodes in the hidden layer by empty circles with a dashed outline. The activation functions for RBF networks are radial basis functions, whereas multi-layer perceptrons typically use piecewise linear or sigmoid activation functions.]
Statistical models constitute a very rich class which includes similar-day and exponential smoothing methods, regression models, and AR-type time series models such as ARX, ARIMA and GARCH specifications.
Computational intelligence models

Computational intelligence techniques combine elements of learning, evolution and fuzziness to create approaches that are capable of adapting to complex dynamic systems, and may be regarded as "intelligent" in this sense. Artificial neural networks, fuzzy systems and support vector machines are unquestionably the main classes of computational intelligence techniques in EPF. Their major strength is the ability to handle complexity and non-linearity. In general, computational intelligence methods are better at modeling these features of electricity prices than the statistical techniques. At the same time, this flexibility is also their major weakness: the ability to adapt to nonlinear, spiky behavior will not necessarily result in better point or probabilistic forecasts.
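As a concrete, purely hypothetical sketch of the neural network idea, the following implements the forward pass of a small multi-layer perceptron with sigmoid hidden units, mapping two normalized inputs (a lagged price and a load forecast) to a price forecast. The weights are invented for illustration; in practice they would be learned from historical data, e.g. by backpropagation.

```python
import math

def sigmoid(z):
    """Standard logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forecast(inputs, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer with sigmoid activations and a linear output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical weights: 2 inputs -> 2 hidden nodes -> 1 output.
w_hidden = [[0.5, -0.2], [0.1, 0.4]]
b_hidden = [0.0, 0.1]
w_out = [30.0, 20.0]                 # map activations to a price level
b_out = 5.0

# Inputs: normalized lagged price and normalized load forecast.
price = mlp_forecast([0.6, 0.8], w_hidden, b_hidden, w_out, b_out)
```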

Hybrid models

Many of the modeling and price forecasting approaches considered in the literature are hybrid solutions, combining techniques from two or more of the groups listed above. Their classification is non-trivial, if possible at all.
As an example of a hybrid model, AleaModel combines neural networks and Box-Jenkins models.

Forecasting horizons

It is customary to talk about short-, medium- and long-term forecasting, but there is no consensus in the literature as to what the thresholds should actually be. Short-term forecasting generally involves horizons from a few minutes up to a few days ahead, medium-term forecasting from a few days to a few months ahead, and long-term forecasting lead times measured in months, quarters or even years.
In his extensive review paper, Weron looks ahead and speculates on the directions EPF will or should take over the next decade or so. These anticipated developments are discussed in the sections that follow.

Fundamental price drivers and input variables

Seasonality

A key point in electricity spot price modeling and forecasting is the appropriate treatment of seasonality. The electricity price exhibits seasonality at three levels: daily, weekly and, to some extent, annual. In short-term forecasting, the annual or long-term seasonality is usually ignored, while the daily and weekly patterns are of prime importance. This, however, may not be the right approach. As Nowotarski and Weron have recently shown, decomposing a series of electricity prices into a long-term seasonal component and a stochastic component, modeling them independently and combining their forecasts can bring, contrary to a common belief, an accuracy gain compared to an approach in which a given model is calibrated to the prices themselves.
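The decomposition idea can be sketched on synthetic data (a hypothetical toy, not the authors' actual procedure): estimate the long-term seasonal component with a centered moving average and treat the remainder as the stochastic component, to be modeled and forecast separately.

```python
# Toy decomposition of a daily price series into a long-term seasonal
# component (centered moving average) and a stochastic remainder.
import math

n = 730  # two years of synthetic daily prices
annual = [10 * math.sin(2 * math.pi * t / 365) for t in range(n)]
weekly = [5 * math.sin(2 * math.pi * t / 7) for t in range(n)]
price = [40 + a + w for a, w in zip(annual, weekly)]

def centered_ma(series, window):
    """Centered moving average as a crude long-term seasonal estimate."""
    half = window // 2
    out = []
    for t in range(len(series)):
        lo, hi = max(0, t - half), min(len(series), t + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

lts = centered_ma(price, 29)                    # long-term component
resid = [p - l for p, l in zip(price, lts)]     # stochastic component
# lts and resid would now be modeled independently and their
# forecasts recombined into a price forecast.
```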
In mid-term forecasting, the daily patterns become less relevant and most EPF models work with average daily prices. However, the long-term trend-cycle component plays a crucial role. Its misspecification can introduce bias, which may lead to a bad estimate of the mean reversion level or of the price spike intensity and severity, and consequently to underestimating the risk. Finally, in the long term, when the time horizon is measured in years, the daily, weekly and even annual seasonality may be ignored, and long-term trends dominate. Adequate treatment of seasonality, both in-sample and out-of-sample, has not been given enough attention in the literature so far.

Variable selection

Another crucial issue in electricity price forecasting is the appropriate choice of explanatory variables. Apart from historical electricity prices, the current spot price is dependent on a large set of fundamental drivers, including system loads, weather variables, fuel costs, the reserve margin and information about scheduled maintenance and forced outages. Although "pure price" models are sometimes used for EPF, in the most common day-ahead forecasting scenario most authors select a combination of these fundamental drivers, based on the heuristics and experience of the forecaster. Very rarely has an automated selection or shrinkage procedure been carried out in EPF, especially for a large set of initial explanatory variables. However, the machine learning literature provides viable tools that can be broadly classified into two categories: feature (variable) selection methods, which pick a subset of the candidate inputs, and shrinkage (regularization) methods, such as ridge regression, the lasso and elastic nets, which penalize model complexity.
Some of these techniques have been utilized in the context of EPF, but their use is not common. Further development and employment of methods for selecting the most effective input variables, from among the past electricity prices as well as the past and predicted values of the fundamental drivers, is needed.
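A minimal, hypothetical illustration of the feature-selection idea is to screen candidate inputs by their absolute sample correlation with the price; real EPF studies use more sophisticated stepwise or shrinkage procedures.

```python
# Hypothetical toy: rank candidate explanatory variables by the absolute
# sample correlation of each with the electricity price.
import math
import random

random.seed(0)
n = 200
load = [random.gauss(0, 1) for _ in range(n)]
wind = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]       # irrelevant input
price = [3 * l - 2 * w + random.gauss(0, 0.5)
         for l, w in zip(load, wind)]

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

candidates = {"load": load, "wind": wind, "noise": noise}
ranked = sorted(candidates, key=lambda k: -abs(corr(candidates[k], price)))
# "load" and "wind" should rank above the irrelevant "noise" input
```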

Spike forecasting and the reserve margin

When predicting spike occurrences or spot price volatility, one of the most influential fundamental variables is the reserve margin, also called surplus generation. It relates the available capacity, C_t, to the demand, D_t, at a given moment in time t. The traditional engineering notion of the reserve margin defines it as the difference between the two, i.e., RM_t = C_t - D_t, but many authors prefer to work with dimensionless ratios, such as the so-called capacity utilization rho_t = D_t / C_t or the relative reserve margin R_t = RM_t / C_t = 1 - rho_t. The rare application of the reserve margin in EPF can be justified only by the difficulty of obtaining good quality data. Given that more and more system operators are disclosing such information nowadays, reserve margin data should be playing a significant role in EPF in the near future.
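In code, these quantities are one-liners; the capacity and demand values below are hypothetical.

```python
# Reserve margin quantities for a single hour t (hypothetical values, MW).
C_t = 1200.0                # available capacity
D_t = 1000.0                # demand

RM_t = C_t - D_t            # engineering reserve margin: 200 MW
rho_t = D_t / C_t           # capacity utilization (dimensionless)
R_t = RM_t / C_t            # relative reserve margin, equals 1 - rho_t
```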

Probabilistic forecasts

The use of prediction intervals (PIs) and densities, or probabilistic forecasting, has become much more common over the past three decades, as practitioners have come to understand the limitations of point forecasts. Despite the bold move by the organizers of the Global Energy Forecasting Competition 2014, who required participants to submit forecasts of the 99 percentiles of the predictive distribution rather than point forecasts as in the 2012 edition, probabilistic forecasting is not yet common practice in EPF.
If PIs are computed at all, they are usually distribution-based or empirical. The latter method resembles the estimation of Value-at-Risk via historical simulation, and consists of computing sample quantiles of the empirical distribution of the one-step-ahead prediction errors. A new forecast combination technique, called Quantile Regression Averaging, has recently been introduced in the context of EPF. It involves applying quantile regression to the point forecasts of a small number of individual forecasting models or experts, and hence allows the existing development of point forecasting to be leveraged.
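The empirical approach can be sketched as follows (synthetic errors, hypothetical numbers): sample quantiles of the historical one-step-ahead forecast errors are added to the current point forecast to form a prediction interval.

```python
# Empirical prediction interval from historical one-step-ahead errors.
import random

random.seed(7)
errors = [random.gauss(0.0, 3.0) for _ in range(1000)]  # past forecast errors

def quantile(data, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(data)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

point_forecast = 45.0
lower = point_forecast + quantile(errors, 0.05)
upper = point_forecast + quantile(errors, 0.95)
# (lower, upper) is a 90% prediction interval for the next price
```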

Combining forecasts

Forecast combinations, also known as combining forecasts, forecast averaging or model averaging (in econometrics) and committee machines, ensemble averaging or expert aggregation (in machine learning), are predictions of the future created by combining several separate forecasts, often produced using different methodologies. Despite their popularity in econometrics, averaged forecasts have not been used extensively in the context of electricity markets to date. There is some limited evidence on the adequacy of combining forecasts of electricity demand, but it was only very recently that combining was used in EPF, and only for point forecasts. Combining probabilistic forecasts is much less popular, even in econometrics in general, mainly because of the increased complexity of the problem. Since Quantile Regression Averaging leverages existing developments in point forecasting, it is particularly attractive from a practical point of view and may become a popular tool in EPF in the near future.
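A toy example (invented numbers) of the two simplest combination schemes: the equal-weighted average and an inverse-MSE weighted average of two hypothetical forecasters.

```python
# Combining point forecasts from two hypothetical models.
model_a = [41.0, 39.5, 42.0]      # forecasts from model A
model_b = [44.0, 43.0, 45.5]      # forecasts from model B
mse_a, mse_b = 4.0, 16.0          # historical mean squared errors (assumed)

# Equal-weighted (simple) average.
simple = [(a + b) / 2 for a, b in zip(model_a, model_b)]

# Weight each model by the inverse of its historical MSE.
w_a = (1 / mse_a) / (1 / mse_a + 1 / mse_b)
w_b = 1 - w_a
weighted = [w_a * a + w_b * b for a, b in zip(model_a, model_b)]
```

Inverse-MSE weighting shifts the combined forecast toward the historically more accurate model, here model A.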

Multivariate factor models

The literature on forecasting daily electricity prices has concentrated largely on models that use only information at the aggregated level. On the other hand, the very rich body of literature on forecasting intra-day prices has used disaggregated data, but generally has not explored the complex dependence structure of the multivariate price series. If we want to explore the structure of intra-day electricity prices, we need to use dimension reduction methods; for instance, factor models with factors estimated as principal components. Empirical evidence indicates that there are forecast improvements from incorporating disaggregated data for predicting daily system prices, especially when the forecast horizon exceeds one week. With the increase of computational power, the real-time calibration of these complex models will become feasible and we may expect to see more EPF applications of the multivariate framework in the coming years.

A universal test ground

All major review publications conclude that there are problems with comparing the methods developed and used in the EPF literature. This is due mainly to the use of different datasets, different software implementations of the forecasting models and different error measures, but also to the lack of statistical rigor in many studies. This calls for a comprehensive, thorough study involving the same datasets, the same robust error evaluation procedures, and statistical testing of the significance of one model's outperformance of another. To some extent, the Global Energy Forecasting Competition 2014 has addressed these issues. Yet more has to be done. A selection of the better-performing measures should be used either exclusively or in conjunction with the more popular ones. The empirical results should be further tested for the significance of the differences in forecasting accuracies of the models.