Google's New Hurricane Model Was Breathtakingly Good This Season (arstechnica.com) 43
An anonymous reader quotes a report from Ars Technica: Although Google DeepMind's Weather Lab only started releasing cyclone track forecasts in June, the company's AI forecasting service performed exceptionally well. By contrast, the Global Forecast System model, which is operated by the US National Weather Service, based on traditional physics, and run on powerful supercomputers, performed abysmally. The official data comparing forecast model performance will not be published by the National Hurricane Center for a few months. However, Brian McNoldy, a senior researcher at the University of Miami, has already done some preliminary number crunching.
The results are stunning: A little help in reading the graphic is in order. This chart sums up the track forecast accuracy for all 13 named storms in the Atlantic Basin this season, measuring the mean position error at various hours in the forecast, from 0 to 120 hours (five days). On this chart, the lower a line is, the better a model has performed. The dotted black line shows the average forecast error for official forecasts from the 2022 to 2024 seasons. What jumps out is that the United States' premier global model, the GFS (denoted here as AVNI), is by far the worst-performing model. Meanwhile, at the bottom of the chart, in maroon, is the Google DeepMind model (GDMI), performing the best at nearly all forecast hours.
The difference in errors between the US GFS model and Google's DeepMind is remarkable. At five days, the Google forecast had an error of 165 nautical miles compared to 360 nautical miles for the GFS model, more than twice as bad. This is the kind of error that causes forecasters to completely disregard one model in favor of another. But there's more. Google's model was so good that it regularly beat the official forecast from the National Hurricane Center (OFCL), which is produced by human experts looking at a broad array of model data. The AI-based model also beat highly regarded "consensus models," including the TVCN and HCCA products. For more information on various models and their designations, see here.
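The metric behind that chart, mean track position error at each forecast lead time, can be sketched in a few lines. This is an illustrative reimplementation, not the NHC's verification code; the function names and data layout here are invented for the example.

```python
# Sketch of the chart's metric: mean great-circle distance, in nautical
# miles, between forecast and observed storm positions at each lead time.
import math

def great_circle_nm(p1, p2):
    """Great-circle distance between two (lat, lon) points, in nautical miles."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3440.065 * math.asin(math.sqrt(a))  # Earth radius ~= 3440.065 nmi

def mean_track_error(forecasts, observed):
    """forecasts/observed: {lead_hour: [(lat, lon), ...]}, aligned by storm fix.
    Returns {lead_hour: mean position error in nautical miles}."""
    return {
        h: sum(great_circle_nm(f, o) for f, o in zip(forecasts[h], observed[h]))
           / len(forecasts[h])
        for h in forecasts
    }
```

A model whose 120-hour line sits at 165 nmi on the chart is simply one whose `mean_track_error` at `h=120`, accumulated over the season's verifiable forecasts, comes out to 165.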
Re: (Score:2)
Why do you say that? TV forecasts are averages over a region. That's by design. Do you really want the TV weather presenter to rattle off rain and temperature and humidity numbers for every block in every city in the whole state you live in? No, they just give indicative data. It's not going to apply to you. It's a guideline.
If you want to know if you're going to be rained on, go get a radar image map with rain data updated every minute, and work out if the clouds will cross your street at the exact moment when you will be standing on it.
Re:If it is half as good as weathernews, count me (Score:4, Informative)
Do you really want the TV weather presenter to rattle off rain and temperature and humidity numbers for every block in every city in the whole state you live in?
That's why I use the Met Office app or website in the UK. It's really nice to have very granular forecasting of rain.
If you want to know if you're going to be rained on, go get a radar image map with rain data updated every minute, and work out if the clouds will cross your street at the exact moment when you will be standing on it.
Or use the maps from forecasters who do that for you.
Re: (Score:2)
Re: (Score:2)
The forecasters don't know where you'll be when it matters.
The good ones produce maps, which are basically like rain radar maps extrapolated forwards in time.
WayDumberThanDirt (Score:1)
WayDumberThanDirt told me Trump was banning hurricanes and increasing tariffs 20% on any countries that reported on them.
Stop The Science !
Re: (Score:1)
retard.
Re: (Score:2)
Trouble is that this is believable.
Breathtakingly bad (Score:1)
The fact that the linked graph requires a paragraph-long explanation indicates that it is breathtakingly bad.
Re: (Score:3)
The second sentence explains it perfectly fine to a layperson, and anyone used to interpreting plots has no real problem even without the supplied explanation. The only points of confusion for someone unfamiliar with the different models are the labels.
Re: (Score:2)
Ars Technica Clickbait or Useful? (Score:5, Informative)
Re: (Score:1)
Agreed. I came here to say the same thing. Hurricane tracks can be fairly predictable but there are some really wild ones that have a mind of their own. I don't think any of the models could predict those mentioned in the Weather Channel's "Strangest Hurricanes" collection. https://weather.com/storms/hur... [weather.com]
Re: (Score:2)
The site seems to be down, but I would hope any academic evaluating this model would have tested it with historic data as well.
Re: (Score:3)
Good on Google (seriously), however (Score:5, Insightful)
They really should've compared it to the European ECMWF. The US's GFS model has fallen way behind the ECMWF in all sorts of ways, and the US government doesn't seem inclined to provide adequate funding to remedy that.
Note that this is not intended as an anti-Trump post; this has been a longer-term issue over multiple administrations, both Democratic and Republican.
Re: (Score:1)
Google probably just pillaged from ECMWF
Re:Good on Google (seriously), however (Score:5, Funny)
Maybe they taught DeepMind how to look up the ECMWF forecasts / predictions!
Re: (Score:2)
They may not do tracks for Atlantic cyclones or something.
Re: (Score:1)
Note that this is not intended as an anti-Trump post
To be fair, the Trump administration did cut the budget of NOAA and the NWS. [science.org]
Re: (Score:2)
This early model comparison does not include the “gold standard” traditional, physics-based model produced by the European Centre for Medium-Range Weather Forecasts. However, the ECMWF model typically does not do better on hurricane track forecasts than the hurricane center or consensus models, which weigh several different model outputs. So it is unlikely to be superior to Google’s DeepMind.
Re: Good on Google (seriously), however (Score:2)
So, does physics predict less well than AI?
Re: (Score:2)
They really should've compared it to the European ECMWF. The US's GFS model has fallen way behind the ECMWF in all sorts of ways, and the US government doesn't seem inclined to provide adequate funding to remedy that.
Note that this is not intended as an anti-Trump post; this has been a longer-term issue over multiple administrations, both Democratic and Republican.
Yeah, the US model is well known to be worse than the European models overall, but sometimes better in select scenarios. Typically I hear forecasters mixing predictions from the various models based on circumstance.
But even ignoring that... I'd expect that a trained model like Googles would dominate a relatively tepid hurricane season. What will be interesting is if it does a better job on an extremely atypical, high impact (landfall) storm system, and can replicate that repeatedly. In general in modeling (
Model improvement has been amazing (Score:4, Interesting)
Anyone bashing the GFS is free to improve it - the code is on github, Fortran required. Replicating the data assimilation network is an exercise for the reader.
Re: Model improvement has been amazing (Score:4, Informative)
Weather prediction has actually gotten quite good. But your weatherman is not going to give you the real weather prediction because it's confusing and highly probabilistic and because of set expectations.
For example, if the data says the chance of rain is 50% the weatherman will tell you it's 75%. Because they've learned viewers interpret 50% as "we have no idea, it's a coin toss, your guess is as good as mine". And because they've learned to err on the side of rain so the viewers don't get angry when they plan a beach day.
Additional complexities might be things like 34% chance of rain if temperatures hit the expected high of 67, but if the wind turns and the temp drops, the chance of rain climbs to 43%. If it rains, the temperature tomorrow should be adjusted down 1.5 degrees for the first six hours, then reach previous expected temps around 2pm.
Too much information.
Re: (Score:2)
Re: (Score:2)
First you have to learn about the various models available and what their limitations are. TropicalTidbits.com has many different charts under "Forecast Models" and windy.com allows you to choose from a few standard models and does a good job visualizing the data. To go deeper you can use a GRIB viewer app like LuckGrib (MacOS/iOS only - there are other viewers out there) which can download subsets of model data from its own servers. Or you can download parts of the models yourself from NOAA's NOMADS repository.
Re: (Score:2)
I'm no hurricane expert (Score:1)
But a season that mostly consists of fish storms seems like not the most challenging task for a prediction model. Who cares what the margin for error is when we're talking hundreds of miles of open ocean?
The real test will be when we have another season with storms that actually end up heading towards the east coast.
Re: I'm no hurricane expert (Score:2)
Do container ships carrying your Amazon purchases care?
Re: (Score:2)
Do container ships carrying your Amazon purchases care?
It's right there in TFS - this was about the Atlantic hurricane season. You might want to look on a map to see where China is. Just sayin'.
Re: I'm no hurricane expert (Score:2)
"Atlantic hurricane season (June 1 to November 30) affects shipping routes by causing delays, rerouting, and increased costs due to port closures, road and rail damage, and dangerous storm conditions. [...]
Most affected areas
U.S. Atlantic and Gulf Coasts: States like Florida, Louisiana, Texas, North and South Carolina, and Georgia are particularly vulnerable."
Humans not so bad (Score:1)
"Google's model was so good that it regularly beat the official forecast from the National Hurricane Center (OFCL), which is produced by human experts looking at a broad array of model data."
This tells me humans do a decent job. Given that performance, I doubt the improvement is worth all that comes with GDMI (which may indirectly be contributing to hurricanes itself).
I Think It's Fantastic (Score:2)
I think the current predictions seem to be remarkably accurate 5-7 days out. It has been a noticeable and dramatic improvement in the last ~5 years.
If Google's models turn out to be even more accurate, that is simply fantastic!
Average track position (Score:2)
Re: (Score:2)
Instead of just the average track error (the dotted black line), I'd be interested in the error of the average track position. In other words, get the track position at each timestamp for all models, average that, then determine the error.
You're describing the consensus models, and they are better than any individual model, at least thus far.
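As a rough illustration of what a consensus track is: average the member models' forecast positions at each lead time, then score that averaged track. The real TVCN and HCCA products apply member selection, weighting, and bias correction that this sketch, with invented names, omits.

```python
# Minimal sketch of a consensus track position: the arithmetic mean of
# several models' forecast (lat, lon) points at one lead time.
def consensus_position(model_positions):
    """model_positions: [(lat, lon), ...], one forecast point per model.
    Returns the unweighted mean position (real consensus products weight
    and bias-correct their members; this sketch does not)."""
    lats = [p[0] for p in model_positions]
    lons = [p[1] for p in model_positions]
    return (sum(lats) / len(lats), sum(lons) / len(lons))
```

Because individual models' track errors tend to scatter in different directions, this averaged position is usually closer to the truth than most members, which is why the consensus lines sit low on the chart.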
No matter how good, (Score:2)
Unsure (Score:2)
Ideal use case for AI (Score:2)
AI is, at its core, a sophisticated pattern recognition system. It's really good at digesting tons of inputs and spotting patterns. Hurricane prediction is exactly the kind of thing these AI models *should* be good at, given the right kinds of input data, and enough of it.