Filter over 6000 values in Kibana

Hey!

I have created a query filter in Python and sent it to Kibana as a saved object.
The final saved object looks like this:

{
    "attributes": {
        "description": "",
        "filters": [{
            "query": {
                "terms": {
                    "serialnumber.keyword": ["1", "2", ..., "6000" ]
                }
            }
        }],
        "query": {
            "language": "kuery",
            "query": "target.keyword : something*"
        },
        "timefilter": {
            "from": "now-15d",
            "refreshInterval": {
                "pause": true,
                "value": 0
            },
            "to": "now"
        },
        "title": "test query"
    },
    "id": "test query",
    "references": [],
    "type": "query",
    "updated_at": "2021-09-14T09:06:09.737Z",
    "version": "test"
}
{
    "exportedCount": 1,
    "missingRefCount": 0,
    "missingReferences": []
}

But I get the following error when I try to apply this query in Kibana:

Type too_complex_to_determinize_exception

Reason too_complex_to_determinize_exception: Determinizing automaton with 56941 states and 56940 transitions would result in more than 10000 states.

How can I resolve this?

Thank you in advance!

/Angelos

Hello,

What would the purpose of this query be? Usually, if you create queries in Kibana, they are optimised so that they don't hit these issues.
The error itself comes from Elasticsearch, and I'm not expert enough in the queries to figure out exactly where it comes from, although I do suspect the serialnumber filter.

If you only want to filter values above or below a certain number, why not just use lte or gte? You would have to map your field as a number instead of a keyword, but you can work around that with a scripted field or a runtime field.
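
To illustrate the lte/gte suggestion, here is a minimal sketch in Python. It assumes the serial numbers are integers stored as strings in serialnumber.keyword and that the ones you need form a contiguous range, which may not be true for your data. The runtime field name serialnumber_num and the helper build_range_search are made up for this example, and the Painless script will fail on documents whose serial is missing or not numeric, so treat it as a starting point only.

```python
import json


def build_range_search(runtime_field, low, high):
    """Build a search body that parses the keyword serial into a long
    runtime field and filters it with a single range clause."""
    return {
        "runtime_mappings": {
            # Hypothetical runtime field; assumes every document has a
            # numeric string in serialnumber.keyword.
            runtime_field: {
                "type": "long",
                "script": {
                    "source": "emit(Long.parseLong(doc['serialnumber.keyword'].value))"
                },
            }
        },
        "query": {
            "bool": {
                "filter": [
                    {"range": {runtime_field: {"gte": low, "lte": high}}}
                ]
            }
        },
    }


# Placeholder bounds taken from the "1" ... "6000" list in the first post.
body = build_range_search("serialnumber_num", 1, 6000)
print(json.dumps(body, indent=2))
```

One range clause replaces the whole list of values, but the runtime field is evaluated at query time, so it has its own cost on large indices. Mapping the field as a number at index time avoids that.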

Hi @Marius_Dragomir!

Thank you for your fast response.
This query is the result of a visualization that gathers serial numbers.

The idea is to use these serial numbers as a filter in different Discover views or dashboards. As far as I know, in Kibana you cannot use the result of one visualization in other visualizations unless you use scripted fields, which is not an option in our case since we handle a large amount of data and it would reduce performance.

So what I did was download all the needed serial numbers and then import them as a saved object. In the beginning, I used "should": [{ "match_phrase":{}}] and it worked fine, but when the number of serials exceeded the limit of 1024 clauses in the filter, I got an error about max_clause_count. I don't want to increase that yet, to avoid performance issues in searches.

So now I am trying to use a terms query, as I mentioned in the first post.

/Angelos
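
For readers comparing the two approaches described above, here is a rough Python sketch of how each filter body could be generated. The helper names and the placeholder serial list are assumptions for illustration, not the code actually used in this thread. The bool/should form produces one clause per serial, which is what runs into max_clause_count, while the terms form is a single clause carrying all the values.

```python
def build_should_filter(field, serials):
    """One match_phrase clause per serial; the clause count grows with the list."""
    return {
        "query": {
            "bool": {
                "should": [{"match_phrase": {field: s}} for s in serials],
                "minimum_should_match": 1,
            }
        }
    }


def build_terms_filter(field, serials):
    """A single terms clause that carries every serial as a value."""
    return {"query": {"terms": {field: list(serials)}}}


# Placeholder data standing in for the downloaded serial numbers.
serials = [str(n) for n in range(1, 6001)]

should_filter = build_should_filter("serialnumber.keyword", serials)
terms_filter = build_terms_filter("serialnumber.keyword", serials)

print(len(should_filter["query"]["bool"]["should"]))  # number of bool clauses
print(len(terms_filter["query"]["terms"]["serialnumber.keyword"]))  # values in one clause
```

Either dict can be placed in the "filters" array of the saved object shown in the first post. This only shows the shape of the two bodies; it does not explain or resolve the too_complex_to_determinize_exception.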

That filter is going to cause an inordinate amount of issues, since the way it is handled in ES is equivalent to having 6000 different filters, and it will slow things down. I would really look into using a different data type for it, as that is better suited to Lucene. Feel free to also ask about this in the Elasticsearch part of this forum, as they are better equipped to advise on it.
