Our new paper “Purely data-driven medium-range weather forecasting achieves comparable skill to physical models at similar resolution” is available now on arXiv: https://arxiv.org/abs/2008.08626
We show that, with enough data, a deep-learning-based model can compete with and in some cases outperform established physical models (e.g., IFS forecasts at 210 km resolution). We show how such models can be trained on the WeatherBench data set, that they contain plausible learned structures, and that they also fare well for challenging fields such as precipitation. At the same time, our results illustrate that it will be very difficult to improve performance further using only the data that is currently available.
Full abstract: Numerical weather prediction has traditionally been based on physical models of the atmosphere. Recently, however, the rise of deep learning has created increased interest in purely data-driven medium-range weather forecasting with first studies exploring the feasibility of such an approach. Here, we train a significantly larger model than in previous studies to predict geopotential, temperature and precipitation up to 5 days ahead and achieve comparable skill to a physical model run at similar horizontal resolution. Crucially, we pretrain our models on historical climate model output before fine-tuning them on the reanalysis data. We also analyze how the neural network creates its predictions and find that, with some exceptions, it is compatible with physical reasoning. Our results indicate that, given enough training data, data-driven models can compete with physical models. At the same time, there is likely not enough data to scale this approach to the resolutions of current operational models.