UDP event formatting

I love that I can get wx observations in near real time by capturing the UDP broadcasts off the hub, but one thing that bugs me is the formatting of the events. The overall structure is JSON, but with CSV embedded. This is especially aggravating with timestamps: in some event types the timestamp is a JSON field of its own, and in others it’s one of the values in the CSV string.

Life would be a whole lot better if everything was a JSON formatted field.

Hi Eric. It’s true that full name/value pairs can make debugging your JSON parser easier. But we have chosen to use JSON arrays in certain cases in the name of efficiency. This is not a big deal in the individual observations coming over the UDP broadcast, but there are REST calls that can return hundreds or thousands of observations in a single response and we wanted to use the same message format in both cases - that’s why device observations are usually represented as arrays. If each field name were repeated for each observation, those responses would be incredibly bloated.
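To put a rough number on the size argument, here is a minimal Python sketch that serializes the same observations both ways. The field names, values, and the "obs" key are made up for illustration and are not taken from the API.

```python
import json

# Hypothetical field names and values, for illustration only.
FIELDS = ["timestamp", "wind_speed", "wind_direction"]
rows = [[1700000000 + 3 * i, 2.5, 180] for i in range(1000)]

# Array form: each observation is a bare list of values.
as_arrays = json.dumps({"obs": rows})

# Key/value form: every observation repeats every field name.
as_objects = json.dumps({"obs": [dict(zip(FIELDS, row)) for row in rows]})

# Compare the serialized sizes in characters.
print(len(as_arrays), len(as_objects))
```

The key/value form grows by the length of every field name for every observation, so the gap widens as more fields and more observations are returned.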

What do you mean by CSV embedded? Are you referring to the Array of values?

Yes, exactly that. Key/value pairs are so much easier to deal with.

Yup, I totally get it. It’s just that I’m a lazy person at heart and like my data delivered ready to eat. :grin:

Easier? I wrote the code once and it’s done. I even add a bit of code to alert me if the array size changed.
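A rough Python sketch of that "write it once, and alert on size changes" approach might look like the following. The field order in RAPID_WIND_FIELDS, the "ob" key, and the sample message are assumptions for illustration; the real layout has to come from the UDP API reference.

```python
import json
import warnings

# Hypothetical layout: the real field order comes from the UDP API reference.
RAPID_WIND_FIELDS = ("timestamp", "wind_speed", "wind_direction")

def name_fields(values, fields):
    """Pair positional values with names, warning if the array size changed."""
    if len(values) != len(fields):
        warnings.warn(f"expected {len(fields)} values, got {len(values)}")
    return dict(zip(fields, values))

# Made-up sample message, not captured from a real hub.
message = json.loads('{"type": "rapid_wind", "ob": [1700000000, 2.3, 128]}')
print(name_fields(message["ob"], RAPID_WIND_FIELDS))
```

Warning instead of raising keeps the listener running if the format gains a field, while still flagging that the mapping needs a second look.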

I use JavaScript. What language are you using?

I’m using Splunk, so it’s a bit different. Basically, I ingest the data from the source and tell Splunk what the structure is. I can’t tell Splunk, “This is JSON data, but some of it might be CSV.” I have to tell it it’s JSON and let it do the field extractions it’s able to, then manually create field extractions for the array data.

Yes, you will need to keep up with the format of the data per the UDP API and do whatever you need to do to pick the right piece out of the observation’s array of data elements. Same thing Gary needs to do in JavaScript. Same thing several of us need to do in our Python code. It’s not quite a JSON-formatted simple key/value pair kind of thing. Some (dis)assembly required.
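For the timestamp complaint specifically, the "(dis)assembly" can be hidden behind one small helper. This is only a sketch: the ARRAY_KEY_BY_TYPE table, the message type names in it, and the assumption that the timestamp is the first array element are all illustrative and should be checked against the UDP API reference.

```python
import json

# Assumed for illustration: which key holds the array in each message type.
# Check these against the UDP API reference before relying on them.
ARRAY_KEY_BY_TYPE = {"rapid_wind": "ob", "evt_precip": "evt"}

def event_timestamp(raw):
    """Return the epoch timestamp from a UDP message, wherever it lives."""
    message = json.loads(raw)
    # Some message types carry the timestamp as an ordinary JSON field.
    if "timestamp" in message:
        return message["timestamp"]
    # Others bury it inside the positional array.
    key = ARRAY_KEY_BY_TYPE.get(message.get("type"))
    if key is None:
        raise ValueError(f"unknown layout for type {message.get('type')!r}")
    return message[key][0]  # assumes the timestamp is the first element
```

Calling event_timestamp() on each datagram gives downstream code a single place to look for the time, regardless of which shape the message arrived in.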

I know nothing of Splunk or what language it uses. Having arrays in JSON is normal. What you refer to as CSV is actually an indexed array. I’ve never heard anyone refer to that as CSV.

Splunk is a data platform, not a language. It has its own language for searching the data that goes into it, but when data is ingested into Splunk, you have to give it some hints about how it’s structured. The JSON indexed arrays are basically comma-separated values (CSV), and that’s one data structure to Splunk, while JSON is a completely different data structure.

I think there are functions in Splunk to extract fields from JSON arrays, but I haven’t dug into it.

It’s really simple in JavaScript. That’s the language in which I wrote ArchiveSW.

Dig into it. Splunk can decipher anything, if you tell it how to do so.

(disclaimer - did Splunk for a living for 2+ years at former $work)

Nice, I used to admin and do light console development on a Splunk cluster in a past life. Love the product, wish licenses weren’t so dang expensive.

The free license used to index up to 500 MB/day so that’s easily enough to handle a WF system or ten. If you get a developer (free) license that ups the ante to even more, but you have to renew yearly which is easy to do online.
