Weatherflow "Continuous Learning" Algorithm

@dsj and other WF staff, hoping for some clarity here…

Is any information available about the “continuous learning” beta algorithm used to correct errors in data from various sensors, namely the humidity and haptic rain data? I (and others) am curious how such a machine learning process works to provide calibrated sensor data. By its nature, calibration of an instrument requires access to an accurate reference source. How does a ML algorithm perform such an activity without a reference source? Is there a particular type of error being introduced with these sensors that the algorithm can detect and remove? Is the algorithm just taking data from nearby Weather Underground PWS sites and developing a “fudge factor” that gets applied to the readings? Can you help us build some confidence in our data?


I suspect it might be using model data for your location

Hi @pswired Good question! Our “continuous learning” system is actually more than just one algorithm. It’s an ongoing process that performs a set of QC analyses and applies calibration corrections when necessary. Some of the QC steps do not require a reference source. But you’re correct that many steps in the process do need to compare data from your station to one or more reference sources (aka “trusted sources”). Trusted sources currently include gridded analysis products, physical models, mesoscale numerical models and aggregated weather station observations.
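As a rough illustration of the kind of QC step that does need a reference source, here is a minimal sketch that compares a station's readings against a trusted-source series and flags a persistent bias. The function name, sample RH values, and tolerance are all invented for illustration; this is not WeatherFlow's actual implementation.

```python
from statistics import mean

def qc_bias_check(station_readings, reference_readings, max_bias=2.0):
    """Return the mean station-minus-reference offset and whether it
    exceeds a tolerance (i.e. a calibration correction may be needed)."""
    offsets = [s - r for s, r in zip(station_readings, reference_readings)]
    bias = mean(offsets)
    return bias, abs(bias) > max_bias

# Hypothetical hourly RH (%) from a station vs. a trusted source:
station = [71.2, 70.8, 69.9, 71.5]
reference = [68.0, 67.9, 67.1, 68.4]
bias, needs_correction = qc_bias_check(station, reference)
```

A real pipeline would of course weight multiple trusted sources and account for expected local differences, but the core idea is the same: a consistent offset against references is the trigger for a calibration correction.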

For a parameter like UV or RH, the basic step of creating a calibration curve doesn’t actually include any machine learning. For most parameters, that’s a simple line-fitting process (aka “fudge factor”). The AI/ML algorithms kick in when determining which data to use in building the calibration curve. That’s part of the secret sauce that’s continuously evolving and improving.
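The "simple line-fitting" step for a parameter like RH can be sketched as an ordinary least-squares fit of reference values against raw sensor readings. The sample numbers and helper names below are made up for illustration; the interesting (ML) part, per the post above, is choosing which data points go into the fit, which this sketch does not attempt.

```python
def fit_line(xs, ys):
    """Return (slope, offset) of the least-squares line ys ~ slope*xs + offset."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical paired samples: raw RH readings (%) vs. trusted-source values (%)
sensor = [30.0, 45.0, 60.0, 75.0, 90.0]
reference = [28.5, 43.0, 57.5, 72.0, 86.5]
slope, offset = fit_line(sensor, reference)

def calibrate(raw_rh):
    """Apply the fitted calibration curve to a raw reading."""
    return slope * raw_rh + offset
```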

The haptic rain sensor is by far our biggest challenge, since no home weather station before has had a sensor quite like it, and it’s where AI/ML plays the largest role. While every SKY out of the box is great at detecting rain and measuring relative intensities, we learned during the field test that it takes a lot more than a simple fudge factor (or simple curve fit) to translate this into a rain amount for an individual SKY. The relationship between rain rate and sensor signal is a function of many factors, and we’re still learning the nuances of how they all play together. While there is still a way to go, we are very excited about the prospects for improving the process based on what we have already learned. So excited, in fact, that our staff will be presenting new findings at the next annual meeting of the American Meteorological Society in January 2019.
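To make concrete why a single fudge factor isn't enough for the haptic sensor, here is a purely illustrative sketch in which the signal-to-rain-rate mapping depends on a second covariate (wind, which can change how drops strike the sensor surface). Every coefficient and the functional form are invented; the real relationship, as the post says, involves many factors and is still being learned.

```python
def rain_rate(signal, wind_speed, k=0.8, wind_coeff=0.03):
    """Hypothetical mapping from haptic signal to rain rate (mm/h):
    a base linear term scaled by a wind-dependent adjustment.
    A single-factor calibration would have to ignore wind_speed."""
    return k * signal * (1.0 + wind_coeff * wind_speed)
```

A per-unit calibration would then mean learning values like `k` (and possibly `wind_coeff`) for each individual SKY rather than assuming one global constant.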

And remember, we’ve been looking at weather data for many years. We’re not just throwing numbers into a black box (although that works sometimes!). We’re leaning on our decades of meteorological expertise to inform the machine learning algorithms. It takes a lot of genius to create the first and only weather stations that get smarter over time.

In addition to ongoing automated calibration, our network systems will employ machine learning to better understand how nearly infinite weather variables affect each other. For example: the effects of temp or wind on rain accumulation. All of this will lead to improvements in both measurement and location-specific forecasting. More good things to come.

  • Do you expect to ever get to ‘done’, such that your firmware (or equivalent) is accurate and finalized out of the box and no tuning/adjusting is needed?

  • Is each unit individually tuned, or does each unit get a globally calculated ‘best algorithm results’ periodically?

  • If we’re going to get a never-ending (?) number of periodic updates, how do we know which version we’re currently running? Will we ever need to do anything to stay accurate (or get more accurate)? What if we choose not to connect to your servers and go local-only, or off the internet completely?

Guess my other concern is what happens when you guys are plagued by success and you have more than a couple thousand units in the field. I know there’s a lot of touch labor and hand-holding early on, but that’s not a sustainable business model…



Hi Vince.

Yes, that’s the goal!

Each unit leaves the factory with a (hopefully) good calibration, then each unit is individually and regularly QC’d by the CL process. The calibration of individual sensors is adjusted if the CL process determines it’s necessary.

Everyone will have the same “version” running. We do intend to make the progress and status of the CL process (and any calibration adjustments to your station’s sensors) visible to users through the app.

In general you won’t need to do anything - it will just work. Of course, proper siting is always important. In some instances, the CL process may trigger suggestions. For example, the CL process can flag when an AIR appears to be in direct sunlight during certain parts of the day. We hope to use this information to alert the user to conditions like that.

The CL process won’t work in that case, so you’ll be on your own. We do intend to enable you to override the CL process and set your own calibrations through the app.

That would be a good problem to have! We will aim to be 100% hands-off, eventually, but you are correct that there will continue to be a lot of “human in the loop” effort, especially early on. We’re actually excited about that. The data analysis and the developing CL process is really interesting, and as far as we know, unique among weather stations. We hope to have time to file some patents along the way, but our primary goal is delivering the most accurate weather data possible.




I think your SKY needs special attention from @WFmarketing. A UV reading of 12 is rather strange.


So far, these options haven’t come to fruition. Whenever they do, could you add a detailed CL report of what it did and didn’t do? For example, suppose the CL added a +2% offset above 90% RH and KEET was the reference station: the report would show that, along with the time and date of the adjustment. Could there be an option to customize which reference stations CL uses, and for which parameters? Or an option to disable CL calibration for a particular parameter? For example, if I thought the humidity sensor was performing well out of the box, I could tell CL not to calibrate it until I thought the sensor needed it.

Yes, we would like to surface some of the CL analysis to users at some point and give users some control, but it’s not as simple as “XYZ station was used to calibrate your station”. It’s a pretty complex process that is evolving. You can already opt out of CL if you are happy with your current calibration - just drop a note to support.