Filtering documents to any with matching strings from keyword list


I'm currently working on procurement data relevant to the current COVID-19 crisis. Essentially, I have a block of data that I'd like to filter down to COVID-relevant documents using a keyword list.

If a document does not contain at least one of these keywords or key phrases in any of its values, I'd like it excluded from my data set.

What's the best way to do this? I have been looking into Elasticsearch filtering, but it seems you have to specify the field you're filtering by. Using the KQL search bar seems to crash with more than a few dozen OR terms.

Any help appreciated!

Hi @twright8,

Assuming you want to explore your data in Kibana's Discover:

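You could switch the query language from KQL to Lucene (the switch is to the right of the query input).
Then the query can be a space-separated list of your keywords:
keyword1 keyword2 keyword3

Enter the list without surrounding quotes: in Lucene syntax a quoted string is matched as an exact phrase, so quote only the multi-word key phrases.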

Your documents will be filtered by those keywords, and only documents that include at least one of them in any searchable field will be shown.
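If the keyword list lives in a file or script, the query string can be generated rather than typed by hand. The following is only a minimal sketch: the function names and the sample keywords are made up for illustration, and it assumes multi-word phrases should be quoted and Lucene reserved characters escaped (`<` and `>` are dropped, since they cannot be escaped).

```python
# Reserved characters in Lucene query syntax that can be backslash-escaped.
LUCENE_ESCAPABLE = set('+-=&|!(){}[]^"~*?:\\/')
# "<" and ">" cannot be escaped, so this sketch simply removes them.
LUCENE_UNESCAPABLE = set("<>")


def escape_term(term):
    """Escape (or drop) Lucene reserved characters in a single keyword."""
    out = []
    for ch in term:
        if ch in LUCENE_UNESCAPABLE:
            continue
        out.append("\\" + ch if ch in LUCENE_ESCAPABLE else ch)
    return "".join(out)


def build_lucene_query(keywords):
    """Join keywords with spaces; wrap multi-word phrases in double quotes."""
    parts = []
    for kw in keywords:
        kw = kw.strip()
        if not kw:
            continue  # skip empty entries
        escaped = escape_term(kw)
        parts.append('"' + escaped + '"' if " " in kw else escaped)
    return " ".join(parts)


# Hypothetical sample list, not taken from the real data set.
keywords = ["covid-19", "ventilator", "face mask"]
query = build_lucene_query(keywords)
print(query)  # prints: covid\-19 ventilator "face mask"

# The same string could also be sent to Elasticsearch in a query_string query;
# with no "fields" given, the index's default field setting decides what is searched.
body = {"query": {"query_string": {"query": query, "default_operator": "OR"}}}
```

The printed string can be pasted into the Discover query bar once Lucene is selected.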

Hope this helps

Thanks I will try!

Any help with this? Thanks, I really appreciate it. I get this error:

type":"illegal_argument_exception","reason":"The length of regex [1005] used in the [query_string] has exceeded the allowed maximum of [1000]. This maximum can be set by changing the [index.max_regex_length] index level setting."}}}]},"status":400

Wow, that is a lot of keywords :)

As the error says, the default limit is 1000, but it is possible to increase it.

To do this in Kibana:
Management -> Elasticsearch -> Index Management -> pick the index -> Manage -> Edit index settings -> Edit settings -> add "index.max_regex_length": "<new value greater than 1000>" -> Save
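The same setting can probably also be changed through the Elasticsearch update index settings API (`PUT /<index>/_settings`) instead of the UI. Here is a rough sketch using only the Python standard library; the host `http://localhost:9200`, the index name `procurement`, and the value `5000` are placeholders, and a secured cluster would additionally need authentication.

```python
import json
import urllib.request


def build_settings_request(base_url, index, max_regex_length):
    """Build (but do not send) a PUT request that raises index.max_regex_length."""
    payload = {"index": {"max_regex_length": max_regex_length}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/" + index + "/_settings",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host, index name, and limit; replace with your own values.
req = build_settings_request("http://localhost:9200", "procurement", 5000)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually apply the change against a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Raising the limit only lifts the cap for that index, so very long keyword lists may still be slow to evaluate.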

AMAZING, thank you!
