Best way to search a field for a large set of values

I'm creating a visualization in Kibana and want to filter a url.keyword field so that only a subset of about 50 urls are included in the search. In addition, I want to filter everything on a second field field2.keyword:true.

I currently have a query attached to the visualization that looks something like this:

(url.keyword:"http://a.com" OR ... OR url.keyword:"http://z.com") AND field2.keyword:true

This seems to take a long time to execute (although it was much faster on 2.3 with .raw fields), and when I put several of these visualizations together on a dashboard, it even exhausts the JVM heap and crashes ES. Is there a more efficient way to write this query? I feel like I should be using the JSON input option instead, but I'm not sure it actually queries any differently on the backend.

I appreciate the assistance.

Thanks,
Brandon

You can try a JSON filter like so (just replace ip with url.keyword and your own values):

{
  "query": {
    "bool": {
      "filter": {
        "terms": {
          "ip": [
            "185.231.86.130",
            "166.24.243.98",
            "155.171.113.115"
          ]
        }
      }
    }
  }
}
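If typing out roughly 50 URLs by hand is tedious, a small script can generate the filter body for you. This is only a sketch: the function name `build_terms_filter` and the placeholder URL list are made up for illustration, and it assumes field2.keyword stores the literal string "true". Paste the printed JSON into the filter editor.

```python
import json


def build_terms_filter(field, values, extra_terms=None):
    """Build a bool/filter body that matches any of `values` in `field`.

    `extra_terms` is an optional dict of exact-match conditions that
    must also hold (hypothetical helper, not part of Kibana or ES).
    """
    clauses = [{"terms": {field: list(values)}}]
    for name, value in (extra_terms or {}).items():
        clauses.append({"term": {name: value}})
    return {"query": {"bool": {"filter": clauses}}}


# Placeholder list; substitute your own ~50 URLs here.
urls = ["http://a.com", "http://z.com"]

# Assumption: field2.keyword holds the string "true".
body = build_terms_filter("url.keyword", urls, {"field2.keyword": "true"})
print(json.dumps(body, indent=2))
```

All clauses in the `filter` array must match, so this covers the "AND field2.keyword:true" part of the original query as well.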

Kibana does actually send it to Elasticsearch differently:

With the filter:

"query":{"bool":{"must":[{"query_string":{"query":"*","analyze_wildcard":true}},
{"bool":{"filter":{"terms":{"ip":["185.231.86.130","166.24.243.98","155.171.113.115"]}}}},

Using an OR query ip:"185.231.86.130" OR ip:"166.24.243.98" OR ip:"155.171.113.115":

"query":{"bool":{"must":[{"query_string":{"query":"ip:\"185.231.86.130\" OR ip:\"166.24.243.98\" OR ip:\"155.171.113.115\"","analyze_wildcard":true}},

Let me know if that improves performance.
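For comparison, here is a rough sketch of how the OR version ends up as one long query string that has to be parsed, whereas the terms filter passes the values as a plain list. The helper name `or_query_string` is hypothetical, and the escaping shown only handles backslashes and double quotes.

```python
def or_query_string(field, values):
    """Join values into a field:"value" OR field:"value" query string."""
    parts = []
    for value in values:
        # Escape backslashes first, then double quotes, so the quoted
        # phrase stays intact inside the query string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append('{}:"{}"'.format(field, escaped))
    return " OR ".join(parts)


ips = ["185.231.86.130", "166.24.243.98", "155.171.113.115"]

# Roughly the query_string clause from the OR request above.
or_clause = {
    "query_string": {
        "query": or_query_string("ip", ips),
        "analyze_wildcard": True,
    }
}

# The equivalent structured clause used by the JSON filter.
terms_clause = {"bool": {"filter": {"terms": {"ip": ips}}}}

print(or_clause["query_string"]["query"])
```

With 50 values the query string grows to 50 quoted clauses, while the terms clause stays a single list.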


Thanks for the response. That's good to know. However, this doesn't seem to work when I add it to the JSON input under the y-axis (count) of a visualization. I want to be able to visualize a date histogram that only displays the count for the specified urls and I think the JSON input in the visualization section may accept different information than the JSON filter in Discover. I'm not sure this is the case, though.

You should be able to do this without using the JSON input in visualize by using pinned filters.

Go into the discover tab and create a filter with the url. I'll walk through my ip case. Hit the little + magnifying glass.

Then, on the filter that was added at the top, click the edit button and paste your list query.

Then click the thumbtack to pin this filter.

With the filter pinned, you can go over into the visualize tab and create your visualization that will be filtered to that list.

When you save the visualization, the filter will be saved with it, so whether you are viewing it in a dashboard, or opening it back up, it will retain this filter.

There might be an easier way to achieve this, but this is the first thing that comes to mind.

Let me know if that helps!

Wow! That's awesome, thank you. The ability to pin filters is a powerful one. I was always upset with the fact that I couldn't create filters from the visualization interface but this at least gives me the ability to access and edit them. I will check in again when I have found out if using filters is more efficient than querying a search.

In conclusion, using filters turned out to be a more stable way to search for a large list of values. My searches are more stable and much easier to construct this way.
