Mapping ids to labels with a scripted field

Hi, I am using ES to archive data originally stored in a relational DB, and then provide a Kibana dashboard on top of it. I store "denormalized" data (that is, I perform the joins at upload time), which of course comes with some caveats.

One of them: for visualization and filtering purposes, data should be aggregated based on ids but displayed based on labels. This mapping from ids to labels may change over time, and I would always want to use the current map.

In order to avoid having to update previously stored denormalized data (which in my opinion is an awful scenario), I am thinking of controlling the mapping with a scripted field like the one below:

// Static id -> label lookup, maintained inside the script itself
Map map = [
    'id1':'label1',
    'id2':'label2'
];

// Guard against documents that have no value for the field
if (doc['myfield.keyword'].size() == 0) {
    return 'no_label';
}

// Fall back to a sentinel label for unknown ids
return map.getOrDefault(doc['myfield.keyword'].value, 'no_label');

I can produce and update the scripted field dynamically with the Kibana saved_objects API.
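For illustration, the update could look roughly like the sketch below (Python with the requests library; the Kibana URL, index pattern id, and field names are assumptions, and the exact layout of the "fields" attribute varies across Kibana versions):

import json
import requests

KIBANA = "http://localhost:5601"          # assumed Kibana base URL
INDEX_PATTERN_ID = "my-index-pattern-id"  # hypothetical saved object id

# Current id -> label map, e.g. freshly exported from the relational DB
lookup = {"id1": "label1", "id2": "label2"}

# Inline the map into the Painless source of the scripted field
entries = ", ".join("'{}':'{}'".format(k, v) for k, v in lookup.items())
script = (
    "Map map = [" + entries + "]; "
    "if (doc['myfield.keyword'].size() == 0) { return 'no_label'; } "
    "return map.getOrDefault(doc['myfield.keyword'].value, 'no_label');"
)

# Scripted fields live inside the index pattern's "fields" attribute,
# which Kibana stores as a JSON-encoded string. NB: a real script would
# merge this entry into the existing fields list instead of overwriting it.
scripted_field = {
    "name": "mylabel",
    "type": "string",
    "scripted": True,
    "script": script,
    "lang": "painless",
    "searchable": True,
    "aggregatable": True,
}

resp = requests.put(
    KIBANA + "/api/saved_objects/index-pattern/" + INDEX_PATTERN_ID,
    headers={"kbn-xsrf": "true", "Content-Type": "application/json"},
    data=json.dumps({"attributes": {"fields": json.dumps([scripted_field])}}),
)
resp.raise_for_status()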

However, this solution looks like a weak and somewhat fragile workaround to me. I suppose that, if at all, I could use an approach like this with some safety, given that the map is small and very stable: on the order of hundreds of entries, changed maybe once or twice a month.

You can do the same thing with the static field formatters as part of the index pattern in Kibana.

Just use the "Static Lookup" formatter in the field's format settings and enter your mapping there.
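Under the hood, the formatter is persisted in the index pattern's fieldFormatMap attribute, roughly like the sketch below (the field name is just a placeholder):

# Sketch of the fieldFormatMap entry that the "Static Lookup" formatter
# stores in the index-pattern saved object (field name is hypothetical)
field_format_map = {
    "myfield.keyword": {
        "id": "static_lookup",
        "params": {
            "lookupEntries": [
                {"key": "id1", "value": "label1"},
                {"key": "id2", "value": "label2"},
            ],
            "unknownKeyValue": "no_label",
        },
    }
}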

As a bonus, those lookups won't be sent to Elasticsearch but applied directly at the visualization level, which improves performance.

Thanks Joe, this is interesting.

I saw that the "static_lookup" formatter is easy to update programmatically.

Would it be a problem to (ab)use this look-up functionality by adding, say, 4574 key-value pairs?

And where is the mapping actually taking place? In the Kibana server?

The mapping takes place in the browser client, and it is saved in the index pattern. So when Kibana shows a visualization, it downloads the index pattern saved object containing the whole mapping and applies it before displaying.

Performance-wise, I think ~5k pairs are still manageable; you probably have to increase the server.maxPayloadBytes setting in kibana.yml to be able to save the index pattern. If this number grows to 100k, we should think about a solution within Elasticsearch.
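For example (the value below is just a guess to size against your mapping; the default is 1048576 bytes, i.e. 1 MB):

# kibana.yml -- raise the maximum payload size the Kibana server accepts,
# so a large index pattern saved object can still be written
server.maxPayloadBytes: 4194304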

Thanks again for your clarifying answer.

Just for my understanding:

  • The solution with the scripted field, dynamically updated with, say, a scheduler, will perform the mapping within the ES server, right? With this approach, would you still see a problem working with 5k value pairs? Or 50k value pairs? It feels a bit hacky to me: that would be a very long script...

  • In your answer, when you refer to 100k, is it key-value pairs or server.maxPayloadBytes?

The solution with the scripted field, dynamically updated with, say, a scheduler, will perform the mapping within the ES server, right? With this approach, would you still see a problem working with 5k value pairs? Or 50k value pairs? It feels a bit hacky to me: that would be a very long script...

I wouldn't recommend the script approach, because the script is not persisted within Elasticsearch but is sent to the server with each individual request made from Kibana. For a lot of key-value pairs this would hurt performance significantly, because possibly megabytes of key-value pairs would have to be uploaded for every request each visualization issues. Field formatters seem like the best option for your use case.

  • In your answer, when you refer to 100k, is it key-value pairs or server.maxPayloadBytes?

That was referring to key-value pairs.

For best performance I would recommend putting the value into the documents within Elasticsearch via the "update_by_query" API: Update By Query API | Elasticsearch Guide [8.11] | Elastic

Then Kibana doesn't have to know about the lookup and can just use the nice format. But of course that approach comes with its own downsides.
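A minimal sketch of that approach (Python with the requests library; the Elasticsearch URL, index name, and field names are assumptions):

import json
import requests

ES = "http://localhost:9200"   # assumed Elasticsearch URL
INDEX = "my-archive-index"     # hypothetical index name

# Current id -> label map, exported from the relational DB
lookup = {"id1": "label1", "id2": "label2"}

# Rewrite the stored label on every document, taking the lookup
# from script params instead of inlining it into the source
body = {
    "script": {
        "lang": "painless",
        "params": {"lookup": lookup},
        "source": (
            "ctx._source.mylabel = "
            "params.lookup.getOrDefault(ctx._source.myfield, 'no_label')"
        ),
    },
    "query": {"match_all": {}},
}

resp = requests.post(
    ES + "/" + INDEX + "/_update_by_query?conflicts=proceed",
    headers={"Content-Type": "application/json"},
    data=json.dumps(body),
)
resp.raise_for_status()
print(resp.json().get("updated"), "documents updated")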
