Handling tags in Elasticsearch documents and Kibana visualisations

Hey folks

I have been asked to determine how we can leverage a set of tags that will be added to all documents indexed in future, and I am looking for guidance on best practices.

Our clusters are primarily used as analytics stores, and we must be able to filter dashboards by tag value in a user-friendly way (ideally with a control filter where the tag key is the parent of the tag value).

The tags will look like the below, each with a key and value pair. I have looked at indexing these as nested, flattened and object fields, but nested and flattened do not work for us in Kibana:

    "tags": [
      {
        "key": "location",
        "value": "Ireland"
      },
      {
        "key": "location",
        "value": "England"
      },
      {
        "key": "lifecycle",
        "value": "to_be_sunset"
      }
    ],

So far the best compromise I have found is to concatenate each key-value pair, e.g. "location-Ireland", "location-England", "lifecycle-to_be_sunset".

My next question is whether it's better to index these as
"tags": ["location-Ireland","location-England","lifecycle-to_be_sunset"],
or
"tags": [{"value":"location-Ireland"},{"value":"location-England"},{"value":"lifecycle-to_be_sunset"}],

I'm open to all feedback and suggestions, but the key requirement is that it is usable via Kibana for filtering.

I would strongly recommend avoiding tags for content that is ultimately key-value.

In Elastic Common Schema, tags is an array of strings (keywords):

tags
List of keywords used to tag each event.
type: keyword
Note: this field should contain an array of values.
example: ["production", "env2"]

By using tags as a key-value store, you'll be in conflict with the schema and will have a hard time using Elastic-native data sources.

Instead of using tags, I would recommend creating a normal object and using your tag keys, such as location and lifecycle, as real field names. You would then add this object to each document under something like a MyOrg key. So to each document you would add:

"MyOrg" : {
   "location": "{city}",
   "lifecycle": "to_be_sunset"
}

This will make each one of these a full field in Kibana that enables filtering, sorting, dashboard controls, and more. You will also avoid some of the headaches that come with dealing with multi-value types when you start using these in KQL and ES|QL!

Kibana has traditionally not had great support for nested documents (not sure if this has changed recently) so having an array with key-value objects is likely not going to work well. If there are a limited number of possible keys I would probably nest these:

"tags": {
    "location": ["England","Ireland"]
}

If there are a large number of possible keys, your example where you concatenated key and value is IMHO probably the best.


Thanks for the feedback, but the only issue we have with this approach is that (as per the example) the key part of the tag may be repeated multiple times.

This is one option I was thinking about overnight and will be testing this morning. We might still have some limitations on how we visualise that data.

Maybe we can have a compromise:

"tags" : {
   "location": ["England","Ireland"],
   "lifecycle": "to_be_sunset"
}

Again, to be tested this morning.
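
For anyone wanting to try that shape, here is a minimal Python sketch of the reshaping step, assuming the tags arrive as the key/value list from the first post. The function name `group_tags` is made up for illustration. It always emits arrays, which should be harmless because Elasticsearch has no dedicated array type and treats a single value and a one-element array the same way.

```python
from collections import defaultdict


def group_tags(tags):
    """Group [{"key": k, "value": v}, ...] into {k: [v, ...]}.

    Hypothetical helper: repeated keys end up as one array per key.
    """
    grouped = defaultdict(list)
    for tag in tags:
        values = grouped[tag["key"]]
        # Skip exact duplicates so a repeated pair is stored once.
        if tag["value"] not in values:
            values.append(tag["value"])
    return dict(grouped)


source_tags = [
    {"key": "location", "value": "Ireland"},
    {"key": "location", "value": "England"},
    {"key": "lifecycle", "value": "to_be_sunset"},
]

doc = {"tags": group_tags(source_tags)}
print(doc)
# {'tags': {'location': ['Ireland', 'England'], 'lifecycle': ['to_be_sunset']}}
```

Each distinct key becomes its own mapped field under tags, so this only suits a small, controlled set of keys.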

If you are using tags for filtering and not aggregations (?), it may be worthwhile looking at using the nested example I showed together with the flattened field type. I am not sure how well this works with Kibana, though.

This would work for a limited number of tag types, but be aware of the risk of mapping explosion if the number is large.

Nested and flattened won't work with Kibana Lens visualisations at all. I have tested both, and the Lens tool does not see the "tags" field.

I am very conscious of mapping explosion, and we will have to put some limit on the volume of tags being applied.
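
One way such a limit could be enforced is at ingest time, before documents reach the cluster. The sketch below assumes an allowlist of tag keys maintained by whoever owns the pipeline; `ALLOWED_TAG_KEYS` and `filter_allowed` are hypothetical names, and the value 200 is only a placeholder. The `index.mapping.total_fields.limit` index setting is a real Elasticsearch setting (default 1000) that caps the number of mapped fields as a backstop.

```python
# Hypothetical allowlist of tag keys that may become mapped fields.
ALLOWED_TAG_KEYS = {"location", "lifecycle"}


def filter_allowed(tags, allowed=ALLOWED_TAG_KEYS):
    """Split tags into those with an allowed key and the rejected key names."""
    kept = [t for t in tags if t["key"] in allowed]
    dropped = sorted({t["key"] for t in tags if t["key"] not in allowed})
    return kept, dropped


# Index-creation body with a cap on mapped fields (200 is an assumed value).
index_settings = {"settings": {"index.mapping.total_fields.limit": 200}}

incoming = [
    {"key": "location", "value": "Ireland"},
    {"key": "owner", "value": "team-a"},  # not on the allowlist
]
kept, dropped = filter_allowed(incoming)
print(kept)     # [{'key': 'location', 'value': 'Ireland'}]
print(dropped)  # ['owner']
```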


If the values can be an array, doesn't that address that problem?

The problem is that the business teams want to maintain the key-value relationship, since the same value can be interpreted differently depending on the key (the key gives it context).

Given all the options above, having tested them and weighed them against the potential volume of keys, the risk of mapping explosion, etc., the only option that seems to work for me is to concatenate the key and value pair:

"tags": ["location-Ireland","location-England","lifecycle-to_be_sunset"],

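For reference, a rough Python sketch of how the concatenation could be done and queried. Everything here is illustrative: the helper names are made up, the "-" separator is taken from the example above, and the sketch assumes tag keys never contain the separator so a tag can be split back into its key and value. The dictionaries are plain Elasticsearch request bodies (a keyword mapping, a term query, and a terms aggregation using the `include` regular expression option); nothing is sent to a cluster, and how Kibana controls behave on top of this is not covered.

```python
SEPARATOR = "-"  # assumed separator; keys must not contain it


def to_concatenated(tags, sep=SEPARATOR):
    """Turn [{"key": k, "value": v}, ...] into ["k-v", ...] without duplicates."""
    out = []
    for tag in tags:
        key, value = tag["key"], tag["value"]
        if sep in key:
            raise ValueError(f"tag key {key!r} contains the separator {sep!r}")
        joined = f"{key}{sep}{value}"
        if joined not in out:
            out.append(joined)
    return out


def split_tag(tag, sep=SEPARATOR):
    """Recover (key, value) by splitting on the first separator only."""
    key, _, value = tag.partition(sep)
    return key, value


source_tags = [
    {"key": "location", "value": "Ireland"},
    {"key": "location", "value": "England"},
    {"key": "lifecycle", "value": "to_be_sunset"},
]
doc = {"tags": to_concatenated(source_tags)}
print(doc["tags"])
# ['location-Ireland', 'location-England', 'lifecycle-to_be_sunset']

# Mapping body: tags as a keyword field, matching the ECS definition.
mapping = {"mappings": {"properties": {"tags": {"type": "keyword"}}}}

# Exact-match filter on one key-value pair.
term_filter = {"query": {"term": {"tags": "location-Ireland"}}}

# List the values seen for one key by restricting buckets with a regex.
values_for_key = {
    "size": 0,
    "aggs": {
        "location_values": {
            "terms": {"field": "tags", "include": "location-.*"}
        }
    },
}
```

Splitting on the first separator only means values may still contain "-", which keeps the key recoverable for display.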