I love that I can get wx observations in near real time by grabbing the UDP broadcasts off the hub, but one thing that bugs me is the formatting of the events. The overall structure is JSON, but with CSV embedded. This is especially aggravating with the timestamp: in some event types it’s one of the JSON formatted fields, and in others it’s one of the values in the CSV string.
Life would be a whole lot better if everything were a JSON formatted field.
Hi Eric. It’s true that full name/value pairs can make debugging your JSON parser easier. But we have chosen to use JSON arrays in certain cases in the name of efficiency. This is not a big deal in the individual observations coming over the UDP broadcast, but there are REST calls that can return hundreds or thousands of observations in a single response and we wanted to use the same message format in both cases - that’s why device observations are usually represented as arrays. If each field name were repeated for each observation, those responses would be incredibly bloated.
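To put a rough number on the bloat argument, here is a minimal sketch that serializes the same observations both ways and compares the sizes. The field names and the three-value layout are made up for illustration; they are not taken from the API docs.

```python
import json

# Hypothetical field names, for illustration only.
FIELDS = ["timestamp", "wind_speed", "wind_direction"]


def as_arrays(rows):
    """Positional form: each observation is a bare list of values."""
    return json.dumps({"obs": rows}, separators=(",", ":"))


def as_objects(rows):
    """Name/value form: every observation repeats every field name."""
    named = [dict(zip(FIELDS, row)) for row in rows]
    return json.dumps({"obs": named}, separators=(",", ":"))


# 1000 made-up observations, 3 seconds apart.
rows = [[1493322445 + 3 * i, 2.3, 128] for i in range(1000)]

array_bytes = len(as_arrays(rows))
object_bytes = len(as_objects(rows))
print(f"arrays: {array_bytes} bytes, name/value: {object_bytes} bytes")
```

The gap grows with the number of fields per observation, since each extra field adds its name to every single observation in the name/value form.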
I’m using Splunk, so it’s a bit different. Basically, I ingest the data from the source and tell Splunk what the structure is. I can’t tell Splunk, “This is JSON data, but some of it might be CSV.” I have to tell it it’s JSON and let it do the field extractions it’s able to, then manually create field extractions for the array data.
Yes, you will need to keep up with the format of the data per the UDP API and do whatever you need to do to pick the right piece out of the observation’s array of data elements. Same thing Gary needs to do in JavaScript. Same thing several of us need to do in our Python code. It’s not quite a JSON-formatted simple key/value pair kind of thing. Some (dis)assembly required.
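For what it's worth, the (dis)assembly can be a small lookup table. This is only a sketch: the `ARRAY_LAYOUTS` table, the field names in it, and the `flatten` helper are my own assumptions for illustration, so check each layout against the UDP API docs before relying on it. It only handles flat arrays, not types that nest one array inside another.

```python
import json

# ASSUMPTION: event type -> (key holding the array, names for each position).
# These layouts are illustrative; confirm them against the UDP API docs.
ARRAY_LAYOUTS = {
    "rapid_wind": ("ob", ["timestamp", "wind_speed", "wind_direction"]),
    "evt_precip": ("evt", ["timestamp"]),
    "evt_strike": ("evt", ["timestamp", "distance", "energy"]),
}


def flatten(raw):
    """Turn one event into plain name/value pairs.

    Events whose type is not in ARRAY_LAYOUTS are returned unchanged,
    which covers types that already carry timestamp as a JSON field.
    """
    event = json.loads(raw)
    layout = ARRAY_LAYOUTS.get(event.get("type"))
    if layout is None:
        return event
    key, names = layout
    values = event.pop(key, [])
    # zip stops at the shorter sequence, so extra or missing values are ignored.
    event.update(zip(names, values))
    return event


# Made-up sample event; the serial number is a placeholder.
sample = '{"serial_number":"EXAMPLE-1","type":"rapid_wind","ob":[1493322445,2.3,128]}'
print(json.dumps(flatten(sample)))
```

Running something like this between the UDP listener and Splunk would mean every event arrives with timestamp as an ordinary JSON field, at the cost of maintaining the table whenever the API adds or reorders values.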
I know nothing of Splunk or what language it uses. Having arrays in JSON is normal. What you refer to as CSV is actually an indexed array. I’ve never heard anyone refer to that as CSV.
Splunk is a data platform, not a language. It has its own language for searching the data that goes into it, but when data is ingested in Splunk, you have to give it some hints of how it’s structured. The JSON indexed arrays are basically comma-separated values (CSV) and that’s one data structure to Splunk, while JSON is a completely different data structure.
I think there are functions in Splunk to extract fields from JSON arrays, but I haven’t dug into it.
The free license used to index up to 500 MB/day, so that’s easily enough to handle a WF system or ten. A developer license (also free) raises that limit further, but you have to renew it yearly, which is easy to do online.