Create a data table with columns based on key values in metrics


#1

I have a data set which contains a number of key-value pairs, with each pair being a metric. So, for example, I may have:

<component = A, key = "foo", value = "hello">
<component = A, key = "Bar", value = "World">
<component = B, key = "foo", value = "Goodbye"> ...

I want to create a data table with one row per unique component, containing the columns "foo" and "bar".

I know how to use split rows on the 'component' term to create a row per component. Likewise, I can request the most recent value of a column by using a Top Hit aggregation sorted by timestamp.

However, I'm not sure how to create a column named 'foo' or 'bar'. It seems like I need to filter each column individually, so the foo column can filter where key == 'foo' and the bar column can filter where key == 'bar'. Either that, or I need some way to split columns the way I split rows.

Is there a means to do this in Kibana?


(Joe Fleming) #2

The filter agg might do what you're looking for, but only if you know what keys you care about ahead of time (since the values are all static and manually entered).
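In raw Elasticsearch terms (Kibana builds something like this under the hood), the shape would be a terms aggregation on component with one filtered top_hits per known key. A minimal sketch of the request body, assuming field names `component`, `key`, `value`, and `@timestamp` — substitute your own:

```python
def column_agg(key_name):
    """Restrict to a single key, then take the most recent value for it."""
    return {
        "filter": {"term": {"key": key_name}},
        "aggs": {
            "latest": {
                "top_hits": {
                    "size": 1,
                    "sort": [{"@timestamp": {"order": "desc"}}],
                    "_source": ["value"],
                }
            }
        },
    }

# One row per component; one filtered "column" per key you know about.
body = {
    "size": 0,
    "aggs": {
        "per_component": {
            "terms": {"field": "component"},
            "aggs": {key: column_agg(key) for key in ["foo", "bar"]},
        }
    },
}
```

Each bucket under per_component then carries a latest hit for foo and one for bar, which is roughly the row you described — but again, only for keys you enumerate ahead of time.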

I'm pretty sure I'm missing something, though. I don't really follow what your data looks like and how you're trying to look at it. In your example, is each "component" a row/document?


#3

Yes, each is its own document, with one document per component/key/value pair. This is actually a simplification of the real document, where component is a property and a number of other properties exist.


(Joe Fleming) #4

OK, this is probably a data structure question then. Think of your data in terms of a static schema, where every document has the same schema. Columns in the resulting data table will map pretty much directly to the fields in the documents. Any NoSQL mental model will fit here; just know that records across indices are not related and can't be joined, which means you'll want to denormalize your data and ingest it that way.

So in your case, you'd want to have both "foo" and "bar" fields in the mapping, and just index documents with sparse data (i.e. sometimes foo is empty, sometimes bar is empty). Note that when you query or filter on foo or bar, any record that is missing that field will be excluded.
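A sketch of what that sparse layout looks like, with assumed field names. The point is that every document shares one schema and simply omits the fields it doesn't have, and that filtering on a field silently drops documents missing it:

```python
# Denormalized, sparse documents: one shared schema, absent fields omitted.
docs = [
    {"component": "A", "foo": "hello"},    # no bar field
    {"component": "A", "bar": "World"},    # no foo field
    {"component": "B", "foo": "Goodbye"},  # no bar field
]

# Filtering behaves like a term query: documents without the field
# are simply excluded from the results.
matches = [d for d in docs if d.get("foo") == "hello"]
```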

The other way to do this is via a "type" field, where you set the value to "foo" or "bar" (or anything else). Then you have a field you can aggregate on, and can lay out your data table that way. In that case, you'd end up with a "type" column that has either a foo or bar value.
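A sketch of that type-field layout (names assumed), along with the two-level aggregation that would lay the table out by component and then by type:

```python
# Every document names its key in a "type" field instead of a sparse column.
docs = [
    {"component": "A", "type": "foo", "value": "hello"},
    {"component": "A", "type": "bar", "value": "World"},
    {"component": "B", "type": "foo", "value": "Goodbye"},
]

# Rows split on component, then on type -- both are fields you can
# aggregate on with plain terms aggregations, no per-column filters needed.
body = {
    "size": 0,
    "aggs": {
        "per_component": {
            "terms": {"field": "component"},
            "aggs": {"per_type": {"terms": {"field": "type"}}},
        }
    },
}
```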

If you can give me a little more insight into what the data is and what you're trying to see in it, I can try to give you a recommendation and other insight.


#5

I tried the sparse data set suggestion, but it doesn't appear to do what I want. When I try to create a Top Hit metric on 'properties.foo', what I usually get is a column full of '-'. Some playing around suggests that Top Hit is taking the latest document without regard to whether properties.foo exists, and since it's a sparse data set, the majority of the time properties.foo does not exist and I get the empty result printed. Is there a way to tell it to do Top Hit by timestamp but only look at those documents that contain properties.foo?
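In raw query-DSL terms, I think what I'm asking for is an exists filter in front of the top_hits, so only documents that actually carry properties.foo are considered before sorting by timestamp. A sketch (field names as in my data; I don't know whether Kibana can build this from the visualize UI):

```python
# Only consider documents where properties.foo exists, then take the
# newest one by timestamp.
foo_column = {
    "filter": {"exists": {"field": "properties.foo"}},
    "aggs": {
        "latest": {
            "top_hits": {
                "size": 1,
                "sort": [{"@timestamp": {"order": "desc"}}],
                "_source": ["properties.foo"],
            }
        }
    },
}
```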

I tried not to bore you too much with the full data, but if it will help, I'll go into the details here in case you can suggest a better approach.

I'm trying to create a generic tool which will read and record properties of other applications (it uses REST calls to the applications to get their properties). The applications I want to record are all written in a tool called Redhawk, for software-defined radio.

I'm trying to record the state of an application, which will have a number of components, with each component having an arbitrary number of properties represented by key-value pairs. Each application will have its own properties. I'm hoping my tool for recording properties can be mostly agnostic to the actual application: it will take a list of desired properties to record from a conf file and send those properties to our ELK stack, and someone who understands the application can then create the dashboard.

Anyway, properties can have a simple type (string or int) for the value; a 'struct', which is basically just a sub-object with its own key-value pairs, but where the value must always be a simple type (so no struct containing another struct); or a list of either simple types or structs.
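To illustrate the three shapes (all the names here are made up for the example):

```python
# A simple property: a plain string or numeric value.
simple_prop = {"key": "frequency", "value": 101.5}

# A struct: a sub-object of key-value pairs whose values must
# themselves be simple types -- no struct inside a struct.
struct_prop = {"key": "signal_status",
               "value": {"locked": True, "snr_db": 12.3}}

# A list of simple types (a list of structs is also allowed).
list_prop = {"key": "channels", "value": [1, 2, 3]}
```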

I want to create a data table that will show me the most recent value of certain properties relevant to watching the status of the application. So, for example, for each signal: what frequency is it on, is the modem currently locked on the signal, what is the signal-to-noise ratio, etc.

I originally recorded the value of simple-type properties in 'value', with 'key' used to define the name of the key, and other properties specifying whether we were looking at the properties of a struct or a list. But that led to the original problem: I couldn't properly create a table of the most recent state, since I had no way to show only properties with a given 'key' value.

I've modified the way I generate the XML document I feed into ELK, so I now also create a property field with the name of the property: a property.foo or property.bar, or even property.foo.bar for a struct, will exist with the value equal to the value of the property, in addition to the key and value fields from above. However, I still have the problem that I cannot easily create a table from this sparse data set, since I usually get back an empty string using Top Hits.
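So a document now looks roughly like this (simplified; real documents carry more fields):

```python
# The old key/value pair plus the new named property field, so
# property.foo is directly addressable in a query or aggregation.
doc = {
    "component": "A",
    "key": "foo",
    "value": "hello",
    "property": {"foo": "hello"},
}
```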

I'm able to modify both the data table itself and the XML document I generate to represent the properties. However, I can't figure out a way to do this that makes it easy for someone to create the sort of data table I mentioned.