Machine learning for total cloud cover prediction

Abstract

Accurate and reliable forecasting of total cloud cover (TCC) is vital for many areas such as astronomy, energy demand and production, or agriculture. Most meteorological centres issue ensemble forecasts of TCC; however, these forecasts are often uncalibrated and exhibit worse forecast skill than ensemble forecasts of other weather variables. Hence, some form of post-processing is strongly required to improve predictive performance. As TCC observations are usually reported on a discrete scale taking just nine different values called oktas, statistical calibration of TCC ensemble forecasts can be considered a classification problem with outputs given by the probabilities of the oktas. This is a classical area where machine learning methods are applied. We investigate the performance of post-processing using multilayer perceptron (MLP) neural networks, gradient boosting machines (GBM) and random forest (RF) methods. Based on the European Centre for Medium-Range Weather Forecasts global TCC ensemble forecasts for 2002-2014, we compare these approaches with the proportional odds logistic regression (POLR) and multiclass logistic regression (MLR) models, as well as the raw TCC ensemble forecasts. We further assess whether improvements in forecast skill can be obtained by incorporating ensemble forecasts of precipitation as an additional predictor. Compared to the raw ensemble, all calibration methods result in a significant improvement in forecast skill. RF models provide the smallest increase in predictive performance, while MLP, POLR and GBM approaches perform best. The use of precipitation forecast data leads to further improvements in forecast skill and, except for very short lead times, the extended MLP model shows the best overall performance.
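To make the classification framing concrete, the following minimal sketch shows how a raw ensemble forecast can be read as a probability vector over the nine oktas, which is the form of output that the calibrated models also produce. It is an illustration only, not the method of the paper: the rule mapping a cloud cover fraction to an okta (rounding to the nearest eighth), the function names, and the ensemble values are all assumptions made for this example.

```python
from collections import Counter

N_OKTAS = 9  # oktas take the values 0, 1, ..., 8


def to_okta(tcc_fraction):
    """Map a cloud cover fraction in [0, 1] to an okta.

    Assumption for this sketch: round to the nearest eighth.
    The binning used with real observations may differ.
    """
    if not 0.0 <= tcc_fraction <= 1.0:
        raise ValueError("cloud cover fraction must lie in [0, 1]")
    return int(tcc_fraction * 8 + 0.5)


def raw_ensemble_probabilities(members):
    """Relative frequency of each okta among the ensemble members."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    counts = Counter(to_okta(m) for m in members)
    return [counts.get(k, 0) / len(members) for k in range(N_OKTAS)]


# Hypothetical 8-member ensemble of cloud cover fractions.
ensemble = [0.0, 0.1, 0.12, 0.5, 0.95, 1.0, 1.0, 0.3]
probs = raw_ensemble_probabilities(ensemble)

for okta, p in enumerate(probs):
    print(f"okta {okta}: {p:.3f}")
```

A post-processing model in this setting replaces the relative frequencies with probabilities learned from past forecast and observation pairs, while keeping the same nine-class output.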
