
"Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán"
By Andrew J. Charlton-Perez, Helen F. Dacre, Simon Driscoll, Suzanne L. Gray, Ben Harvey, Natalie J. Harvey, Kieran M. R. Hunt, Robert W. Lee, Ranjini Swaminathan, Remy Vandaele & Ambrogio Volonté. Published in npj Climate and Atmospheric Science (Nature Portfolio), in partnership with CECCR at King Abdulaziz University.
DOI: 10.1038/s41612-024-00638-w.
Here are the main takeaways from the paper:
• AI models (FourCastNet, Pangu-Weather, GraphCast, FourCastNet-v2) demonstrate strong capabilities in capturing the large-scale dynamical drivers vital for rapid storm development, such as the storm's position relative to upper-level jets. They also accurately reproduce the larger synoptic-scale structure of cyclones like Storm Ciarán, including the cloud head's position and the warm sector's shape.
• Despite these strengths, AI models consistently underestimate the peak amplitude of storm-associated winds, both at the surface and in the free atmosphere. They also struggle to resolve detailed structures crucial for issuing severe weather warnings, such as sharp bent-back warm frontal gradients, and show variable success in capturing warm core seclusion.
• The underestimation of strong winds is not a consequence of the AI models' output resolution or their training data. The discrepancy persists even when the models are compared against ERA5 (on which they were trained) and against numerical weather prediction (NWP) models of similar resolution, suggesting a more fundamental limitation in their ability to represent intense wind features.
• The case study of Storm Ciarán highlights the pressing need for a more comprehensive assessment of machine learning weather forecasts. Moving beyond isolated error metrics to evaluate all relevant spatio-temporal features of physical phenomena is essential for identifying specific areas for improvement and fostering rapid advancements in this new and potentially transformative forecasting tool.
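To illustrate why an isolated error metric can hide the kind of peak-wind underestimation described above, here is a minimal sketch on purely synthetic data. Nothing in it comes from the paper: the grid size, wind speeds, and the helper names `make_wind_field`, `rmse`, and `peak` are all assumptions chosen for illustration. It builds a "reference" wind field with a compact 45 m/s maximum and a "forecast" with the same structure but a 35 m/s maximum, then compares domain-wide RMSE with the error in the peak.

```python
import math


def make_wind_field(n, peak, width, background=10.0):
    """Synthetic n x n wind-speed field (m/s): a uniform background
    plus a Gaussian maximum centred on the grid. Hypothetical helper."""
    c = (n - 1) / 2
    return [
        [
            background
            + (peak - background)
            * math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * width ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


def rmse(a, b):
    """Root-mean-square difference over every grid point."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(sq) / len(sq))


def peak(field):
    """Maximum wind speed anywhere in the field."""
    return max(max(row) for row in field)


# Assumed values: same storm position and shape, weaker maximum in the forecast.
reference = make_wind_field(41, peak=45.0, width=2.0)
forecast = make_wind_field(41, peak=35.0, width=2.0)

domain_rmse = rmse(reference, forecast)
peak_error = peak(forecast) - peak(reference)

print(f"domain RMSE: {domain_rmse:.2f} m/s")
print(f"peak wind error: {peak_error:.2f} m/s")
```

In this toy setup the domain RMSE stays below 1 m/s even though the forecast maximum is 10 m/s too weak, because the intense winds occupy only a small part of the grid. A feature-based check such as the peak error exposes the difference that the domain average smooths away.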