Handling tags in Elasticsearch documents and Kibana visualisations

tommycahir · December 11, 2024, 7:34pm

Hey folks

I have been asked to determine how we can leverage a set of tags that will be added to all documents being indexed in future and I am looking for guidance on best practices.

Our clusters are primarily used as analytics stores, and we must be able to filter dashboards in a user-friendly way by tag value (ideally with a control filter whose parent is the tag key).

The tags will look like the example below, each with a key and a value. I have looked at indexing these as nested, flattened and object fields, but nested and

    "tags": [
      {
        "key": "location",
        "value": "Ireland"
      },
      {
        "key": "location",
        "value": "England"
      },
      {
        "key": "lifecycle",
        "value": "to_be_sunset"
      }
    ],

So far, the best compromise I have found is to concatenate each key-value pair, e.g. "location-Ireland", "location-England", "lifecycle-to_be_sunset".

My next question is whether it's better to index these as
"tags": ["location-Ireland","location-England","lifecycle-to_be_sunset"],
or
"tags": [{"value":"location-Ireland"},{"value":"location-England"},{"value":"lifecycle-to_be_sunset"}],

I'm open to all feedback and suggestions but key is to have it usable via Kibana for filtering.
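
A minimal sketch of the concatenation step described above, assuming the tags arrive as a list of key/value objects and that "-" is an acceptable separator. The function name `concat_tags` and the `sep` parameter are illustrative, not part of any Elasticsearch client.

```python
def concat_tags(tags, sep="-"):
    """Turn [{"key": k, "value": v}, ...] into ["k-v", ...].

    Order is preserved and exact duplicates are dropped.
    Assumption: keys never contain the separator, otherwise the
    combined string cannot be split back unambiguously.
    """
    seen = set()
    combined_tags = []
    for tag in tags:
        combined = f"{tag['key']}{sep}{tag['value']}"
        if combined not in seen:
            seen.add(combined)
            combined_tags.append(combined)
    return combined_tags


source_tags = [
    {"key": "location", "value": "Ireland"},
    {"key": "location", "value": "England"},
    {"key": "lifecycle", "value": "to_be_sunset"},
]

# Build the document body with the flat array-of-strings form.
document = {"tags": concat_tags(source_tags)}
print(document)
```

The flat array of strings is the first of the two shapes in the question. The second shape would only need each string wrapped in a `{"value": ...}` object.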

strawgate · December 11, 2024, 7:39pm

I would strongly recommend against using tags for content that is ultimately key-value.

In Elastic Common Schema, tags is an array of strings (keywords):

tags
List of keywords used to tag each event.
type: keyword
Note: this field should contain an array of values.
example: ["production", "env2"]

By using tags as a key-value store, you'll be in conflict with the schema and will have a hard time using elastic-native data sources.

Instead of using tags, I would recommend creating a normal object in which each of your tag keys, such as location and lifecycle, becomes a real field. You would then add this object to each document under something like a MyOrg key. So to each document you would add:

"MyOrg" : {
   "location": "{city}",
   "lifecycle": "to_be_sunset"
}

This will make each one of these a full field in Kibana that enables filtering, sorting, dashboard controls, and more. You will also avoid some of the headaches that come with dealing with multi-value types when you start using these in KQL and ES|QL!

Christian_Dahlqvist · December 11, 2024, 7:39pm

Kibana has traditionally not had great support for nested documents (not sure if this has changed recently) so having an array with key-value objects is likely not going to work well. If there are a limited number of possible keys I would probably nest these:

"tags": {
    "location": ["England","Ireland"]
}

If there are a large number of possible keys, your example where you concatenated key and value is IMHO probably the best.

tommycahir · December 12, 2024, 7:09am

Thanks for the feedback but the only issue we have with this approach is that (as per example) the key part of the tag may be repeated multiple times.

tommycahir · December 12, 2024, 7:11am

This is one option I was thinking about overnight and will be testing this morning. We might still have some limitations on how we visualise that data.

tommycahir · December 12, 2024, 7:12am

Maybe we can have a compromise

"tags" : {
   "location": ["England","Ireland"],
   "lifecycle": "to_be_sunset"
}

Again, to be tested this morning.
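
For reference, a rough sketch of how the key/value list could be reshaped into that object form before indexing. The helper name `group_tags` is hypothetical. Elasticsearch has no dedicated array type, so a field can hold a single value in one document and a list in another.

```python
from collections import defaultdict


def group_tags(tags):
    """Group [{"key": k, "value": v}, ...] into {k: [v, ...]}.

    A key with a single value is emitted as a scalar, mirroring the
    "lifecycle": "to_be_sunset" form in the example above.
    """
    grouped = defaultdict(list)
    for tag in tags:
        values = grouped[tag["key"]]
        if tag["value"] not in values:  # skip repeated pairs
            values.append(tag["value"])
    return {
        key: (values[0] if len(values) == 1 else values)
        for key, values in grouped.items()
    }


source_tags = [
    {"key": "location", "value": "Ireland"},
    {"key": "location", "value": "England"},
    {"key": "lifecycle", "value": "to_be_sunset"},
]

document = {"tags": group_tags(source_tags)}
print(document)
```

Each distinct key becomes its own mapped field under `tags`, which is what makes the number of distinct keys matter.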

Christian_Dahlqvist · December 12, 2024, 7:14am

If you are using tags for filtering and not aggregations (?) it may be worthwhile looking at using the nested example I showed together with the flattened field type. I am not sure how well this works with Kibana, though.

Christian_Dahlqvist · December 12, 2024, 7:16am

This would work for a limited number of tag types, but be aware of the risk of mapping explosion if the number is large.

tommycahir · December 12, 2024, 7:20am

Nested and flattened won't work with Kibana Lens visualisations at all. I have tested this, and Lens does not see the "tags" field.

I am very conscious of mapping explosion, and we will have to put some limit on the volume of tags being applied.
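
One way such a limit could be enforced at ingest time is an allowlist of tag keys, sketched below. The names `ALLOWED_TAG_KEYS` and `split_by_allowed_keys` are hypothetical, and the allowlist contents are only the keys from the example. On the Elasticsearch side, the `index.mapping.total_fields.limit` index setting caps the number of mapped fields as a backstop.

```python
# Hypothetical allowlist; in practice this would come from whatever
# the business teams agree on.
ALLOWED_TAG_KEYS = {"location", "lifecycle"}


def split_by_allowed_keys(tags, allowed_keys=ALLOWED_TAG_KEYS):
    """Separate tags into (kept, rejected) based on their key.

    Only tags whose key is on the allowlist are kept, so the number of
    distinct fields created under the tags object stays bounded.
    """
    kept, rejected = [], []
    for tag in tags:
        if tag["key"] in allowed_keys:
            kept.append(tag)
        else:
            rejected.append(tag)
    return kept, rejected


incoming = [
    {"key": "location", "value": "Ireland"},
    {"key": "owner_team_2024_q4", "value": "blue"},  # not on the allowlist
    {"key": "lifecycle", "value": "to_be_sunset"},
]

kept, rejected = split_by_allowed_keys(incoming)
print("kept:", kept)
print("rejected:", rejected)
```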

strawgate · December 12, 2024, 3:06pm

If the values can be an array, doesn't that address that problem?

tommycahir · December 12, 2024, 5:30pm

The problem is that the business teams want to maintain the key-value pair relationship, and the value alone can be interpreted differently depending on the key (given context).

Having tested all the options above and weighed them against the potential volume of keys, the risk of mapping explosion, etc., the only option that seems to work for me is to concatenate the key and value pair:

"tags": ["location-Ireland","location-England","lifecycle-to_be_sunset"],

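With the concatenated form, filtering by an exact pair or by key alone can be expressed with standard query DSL clauses. The sketch below assumes `tags` is mapped as `keyword`, that "-" is the separator, and that keys never contain it. The helper names are illustrative only.

```python
def tag_filter(key, value, sep="-"):
    """Exact match on one key/value pair, as a term query clause."""
    return {"term": {"tags": f"{key}{sep}{value}"}}


def key_filter(key, sep="-"):
    """Match any value for a given key, as a prefix query clause."""
    return {"prefix": {"tags": {"value": f"{key}{sep}"}}}


def split_tag(tag, sep="-"):
    """Recover (key, value) by splitting on the first separator only,
    so values that themselves contain the separator stay intact."""
    key, _, value = tag.partition(sep)
    return key, value


# Documents tagged location-Ireland that carry any lifecycle tag.
query = {
    "bool": {
        "filter": [
            tag_filter("location", "Ireland"),
            key_filter("lifecycle"),
        ]
    }
}
print(query)
print(split_tag("lifecycle-to_be_sunset"))
```

In the Kibana query bar, the KQL counterparts would be along the lines of `tags : "location-Ireland"` and `tags : location-*`.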