Improving forecast accuracy through the intelligent application of AI

New machine-learning techniques need to be applied very carefully if we are to improve forecasting accuracy at a time of increasing air traffic volatility.

In the world of air traffic forecasting there is a race underway. On one side the industry is becoming less predictable, more prone to sudden industrial, economic and environmental turbulence. On the other, our understanding of the key patterns which provide indications of future demand is improving. New artificial intelligence (AI) techniques and associated improvements in the understanding of air traffic dynamics mean that, despite increased volatility, forecast accuracy is making small but almost constant gains.

“We’re always on the lookout for new techniques that are going to improve the forecast, but techniques are not the whole picture. It is a combination of techniques, data and collaboration with local experts; and it’s these three working together that is really making a difference.

“With an economic forecast you are predicting what the outcome will be from the interaction of millions of people and thousands upon thousands of enterprises. When we’re forecasting air traffic movements in Europe, we’re dealing with relatively small numbers of countries, airports, aircraft operators and aircraft types – which means that one event or one decision by a carrier can significantly change the traffic in several countries. So it’s almost something that’s not statistical.”

Dr David Marsh, Head of the Forecast and Network Intelligence Unit at EUROCONTROL

For the last few years AI has given EUROCONTROL a set of highly capable new tools in its predictive analysis toolbox – but applying these tools in a way which generates real improvements in forecasting accuracy is a highly complex task. For example, one of the most popular AI-based analytical forecasting tools is XGBoost, a gradient-boosting ensemble learning method whose scalable algorithm learns quickly through parallel and distributed computing and uses memory efficiently. EUROCONTROL tried applying this technique to improve overflight trend forecasts – but the results were disappointing.

“There’s an important lesson here: with AI and data science you have to be prepared to try something and recognise that it’s not working and stop until you’ve got a better idea or new data comes along or there are some other techniques that you can try,” says David Marsh. “Because not every AI project is successful and if you’re going to innovate, then some of the time it’s not going to work. Time and again in forecasting we find that we come up with a good method which seems to work in a small example, but when we apply it across the network it improves predictions in 51% of States but produces worse results for the rest; worse still, you can’t tell in advance which State will be better and which worse. Unless there is a proven, consistent improvement we cannot adopt it.”

But when AI does work, the results can be impressive.

“The forecasting system is based on a wide number of different components. We isolate the trends, the relationships between the different factors into components, and ultimately improve the quality and performance of the final forecast. For example, we have put a great deal of effort into forecasting zone pair flows, to determine exactly how many more flights there will be between one zone and another at some time in the future. We have contracted a company to use new machine-learning techniques to help us with this. They have brought more input into the calculations than just the core gross domestic product (GDP) data we were using and the results have been very positive.”

Dr Claire Leleu, Forecasting Manager, Statistics and Forecast (STATFOR) department, EUROCONTROL

EUROCONTROL tested this new AI approach on seven traffic flows across the North Atlantic. The data provided were weekly traffic records going back to 2010. On these zone pairs, the machine-learning process reduced the median absolute error for a specific year by between 8.5% and, for some pairs, as much as 71% compared with the STATFOR median absolute error.
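To make the comparison concrete, here is a minimal sketch of how a median absolute error and its relative reduction could be computed for one zone pair. The function names and the traffic figures are hypothetical, chosen only for illustration, and the error is assumed to be expressed as a percentage of actual traffic; the article does not describe the exact calculation used by STATFOR or its contractor.

```python
from statistics import median


def median_absolute_error(actual, forecast):
    """Median of the absolute percentage errors between forecast and actual."""
    return median(100 * abs(f - a) / a for a, f in zip(actual, forecast))


def error_reduction(baseline_error, new_error):
    """Reduction in error, as a percentage of the baseline error."""
    return 100 * (baseline_error - new_error) / baseline_error


# Hypothetical weekly flight counts for one zone pair (illustrative only).
actual = [100, 110, 120, 130]
baseline = [90, 121, 108, 143]       # assumed existing forecast
candidate = [95, 115.5, 114, 136.5]  # assumed machine-learning forecast

baseline_mae = median_absolute_error(actual, baseline)
candidate_mae = median_absolute_error(actual, candidate)
reduction = error_reduction(baseline_mae, candidate_mae)
print(f"baseline {baseline_mae:.1f}%, candidate {candidate_mae:.1f}%, "
      f"reduction {reduction:.1f}%")
```

A reduction of 8.5% on this scale means the new error is 91.5% of the old one; a reduction of 71% means it is 29% of the old one.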

According to David Marsh: “They adopted a classic AI machine-learning approach, which analyses thousands of different types of exogenous data, more than any analyst could make sense of. The machine-learning system selects the most relevant data sets, producing something more accurate than if the forecast was based on core GDP data. As a result we are re-engineering, restructuring the way we undertake our forecasts to exploit this.”

But relying too much on machine learning can be a dangerous business, EUROCONTROL has found. EUROCONTROL has been using a highly automated, long-established machine-learning system for many years, implementing the Autoregressive Integrated Moving Average forecasting approach (ARIMA).

Even after the input data has been validated and checked, the system can still generate what forecasters call “monsters” – a few results out of the 10,000 generated which are absurdly extreme. At the moment, cleaning up the monsters is a human-based activity; the forecasters have trained an automated system to spot absurdities, but defeating them is not easy.
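The article does not say how EUROCONTROL's automated system spots monsters. As a purely illustrative sketch, one generic way to flag absurdly extreme results is a robust outlier rule based on the median and the median absolute deviation (MAD). The function name, the threshold and the growth figures below are all assumptions.

```python
from statistics import median


def flag_monsters(values, threshold=5.0):
    """Return the indices of values that sit far from the median.

    Distance is measured in units of the scaled median absolute deviation,
    which is far less distorted by extreme values than a standard deviation.
    """
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as extreme
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    scale = 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - centre) / scale > threshold]


# Hypothetical annual growth forecasts in percent; one is absurdly extreme.
growth = [2.1, 1.8, 2.5, 3.0, 2.2, 250.0, 1.9, 0.5, 2.7]
print(flag_monsters(growth))
```

A rule like this only identifies candidates; deciding what to replace them with still needs the human judgement the article describes.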

This is where EUROCONTROL has developed a specialist expertise – identifying the most effective mix of AI and other modelling techniques, then balancing this with expert human analysis. The Agency does not create new statistical modelling techniques, but it does combine them in novel ways into a framework which is unique. As a result, the forecasts produced by STATFOR are both continuously improving and consistently more accurate than comparable industry forecasts. At the moment, it seems, STATFOR is slightly ahead of the industry’s volatility challenge.

“We believe that you have to have good data, good techniques and good collaborators who know the local situation to produce the most accurate forecasts,” says David Marsh. “AI would struggle to synthesise the information we get from local experts across Europe.”

“Maybe not everything should be AI,” says Claire Leleu. “At the moment we can control everything in our toolbox; we know that if we have this input and press the button we will get a certain output which is linked to what we already know. But with the quantity of data you can inject into your systems with AI, that will no longer be possible. Having said that we do need to benefit from these new techniques when it comes to the problem of volatility. Routes can change for many reasons and the explanations are not necessarily clear, but it’s very important to be able to figure out what’s behind the volatility.”

As turbulence within the European air travel market is predicted to grow over the coming years, the importance of identifying and exploiting new AI techniques – and then learning how best to apply them – will also increase.

“It’s perhaps easier to find reasons as to why things have happened than to project them forwards, but one of the lessons from the high-volume input study that we have done recently is to look at how additional sources of information – trade, holiday patterns, migration patterns or population structure – could perhaps make volatility more understandable,” says David Marsh. “Then the challenge will be to project those things forward so we will be able to explain this particular increase in tourism in terms of a change in the exchange rate. That’s good, but how do I forecast that exchange rate going forwards?”

While more data will be required to understand the nature of volatility, it is not the volume that will help but the quality and relevance. The key will be to take increasing amounts of data and reduce it to some underlying trends that will generate a base forecast.

“We’re constantly aware of it. In terms of the volume of data, there’s a basic discipline we have to teach analysts when they arrive in STATFOR which is beware of launching analysis on the data,” says David Marsh. “You have to reduce the sample first, otherwise you will just clog up the analysis framework.”

“There’s a lot of evidence that, if you can forecast in two or three different ways from different data and then you combine the output, you get something which is more robust and more accurate,” says Claire Leleu. “That’s exactly what we do in STATFOR. We have this panel approach that we use in the forecast, and I think any AI will need to be part of a panel. Maybe it will be a panel of different AIs, but that’s the way to defeat the monsters and improve the overall system.”
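The panel idea can be sketched in a few lines. The example below assumes three hypothetical forecasting methods and combines them with a per-period median; the article does not state which combination rule STATFOR uses, so the median is an assumption, chosen here because a single extreme result cannot drag it far.

```python
from statistics import median


def combine_panel(forecasts):
    """Combine forecasts from several methods, period by period, using the median."""
    return [median(period) for period in zip(*forecasts)]


# Hypothetical growth forecasts (percent) for three future periods.
method_a = [2.0, 2.5, 3.0]
method_b = [2.4, 2.1, 2.8]
method_c = [2.2, 40.0, 2.9]  # the second value is a "monster"

combined = combine_panel([method_a, method_b, method_c])
print(combined)  # [2.2, 2.5, 2.9]
```

In the second period the extreme value from one method is simply outvoted by the other two, which is the robustness benefit the quote describes.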

For all seven-year forecasts produced since 1990 by STATFOR the median error is 0.5%, so forecasts have been just slightly low. The corresponding median absolute error per year is just below 3% and the mean relative performance (when compared to a same-as-last-year-growth forecast) is 1.81, which means that STATFOR forecasts have been 81% more accurate than the benchmark.
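The step from a relative performance of 1.81 to "81% more accurate" can be read as a simple ratio. The sketch below assumes relative performance is the benchmark's absolute error divided by the forecast's absolute error; the article does not give the exact formula, and the error values used here are hypothetical.

```python
def relative_performance(benchmark_abs_error, forecast_abs_error):
    """Ratio of benchmark error to forecast error; above 1 means the forecast wins."""
    return benchmark_abs_error / forecast_abs_error


def percent_more_accurate(ratio):
    """Express a relative-performance ratio as a percentage gain over the benchmark."""
    return (ratio - 1.0) * 100.0


# Hypothetical absolute errors, in percentage points, for one forecast year.
benchmark_error = 3.62  # assumed error of a same-as-last-year-growth forecast
forecast_error = 2.0    # assumed error of the published forecast

ratio = relative_performance(benchmark_error, forecast_error)
print(f"relative performance {ratio:.2f}, "
      f"{percent_more_accurate(ratio):.0f}% more accurate")
```

On the same reading, a forecast described as 104% more accurate than the baseline corresponds to a ratio of 2.04.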

In 2018 the median error per year was small, at 1.1%, and the median absolute error per year was 2.1%, one of the lowest errors ever recorded by STATFOR. The 2018 forecast was 104% more accurate than the same-as-last-year-growth baseline.
