Managing queries in Elasticsearch Logstash filter plugin

Hi,

I'm opening this topic as a follow-up to this one: Elasticsearch query sort order index
My main documentation reference for using Elasticsearch queries in Logstash is https://www.elastic.co/guide/en/logstash/current/plugins-filters-elasticsearch.html

What I would like to understand is how to manage query templates to control search results.
Here's an example: let's assume we have three daily indices:

  • logstash-data-2018.07.27
  • logstash-data-2018.07.26
  • logstash-data-2018.07.25

If we use this filter here:

filter {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => ["logstash-data-*"]
        query => "object:%{[data_object]}"
        result_size => 1
        fields => {"some_field_in_logstash-data" => "some_field"}
    }
}

If data_object = 12345678, my understanding is that Logstash uses this query template:

{
  "query": {
    "match": {
      "object": {
        "query": "12345678",
        "type": "phrase"
      }
    }
  }
}

So, if logstash-data-* contains multiple entries, for instance 12345678, 123456789 and 12345678A, all of them will match and Logstash will simply take the first result.

What I would like to achieve is for Logstash to look for the exact match.
Is it possible to achieve this by using this query template here?

{
  "query": {
    "match": {
      "object.keyword": {
        "query": "12345678",
        "type": "phrase"
      }
    }
  }
}

And where should I put this template in order to use the configuration option query_template => "template.json"?

Thank you

Hmmm. Actually I believe it calls:

GET logstash-data-*/_search?q=object:12345678

If you want to search on object.keyword, maybe just do:

filter {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => ["logstash-data-*"]
        query => "object.keyword:%{[data_object]}"
        result_size => 1
        fields => {"some_field_in_logstash-data" => "some_field"}
    }
}

Thank you @dadoonet, just one more clarification please: if Elasticsearch finds more than one entry, given that result_size => 1, what will the sort order be?
Is it correct to assume that the default is "sort" : [ { "@timestamp" : "desc" } ]?

[Edit] I tested the query

GET logstash-data-*/_search?q=object:12345678

against my actual Elasticsearch indices, and I receive multiple results in a seemingly random order (they are apparently ordered by "_score", not by @timestamp or index name).

[Edit2] I think I need this query:

GET logstash-data-*/_search
{
  "size": 1,
  "sort" : [ { "@timestamp" : "desc" } ],
  "query": {
    "match": {
      "object.keyword": {
        "query": "12345678"
      }
    }
  }
}

Yes. It's by default sorted on _score.

If you wish to pass a more complex query, use a query_template.

I shared an example here:

elasticsearch {
  query_template => "search-by-name.json"
  index => ".bano"
  fields => {
    "location" => "[location]"
    "address" => "[address]"
  }
  remove_field => ["headers", "host", "@version", "@timestamp"]
}

And search-by-name.json:

{
  "size": 1,
  "query":{
    "bool": {
      "should": [
        {
          "match": {
            "address.number": "%{[address][number]}"
          }
        },
        {
          "match": {
            "address.street_name": "%{[address][street_name]}"
          }
        },
        {
          "match": {
            "address.city": "%{[address][city]}"
          }
        }
      ]
    }
  }
}
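
Adapted to the question above, a template for an exact match on the newest document could look roughly like the sketch below. It is only a sketch under a few assumptions: that the index has an object.keyword sub-field (the default dynamic mapping for strings creates one), that a term query is what you want for exact matching (it is not analyzed, unlike match), and that the template is saved under a file name of your choosing, for example a hypothetical exact-object.json. The %{[data_object]} reference is substituted from the event, in the same way as in the template above.

```json
{
  "size": 1,
  "sort": [ { "@timestamp": "desc" } ],
  "query": {
    "term": {
      "object.keyword": "%{[data_object]}"
    }
  }
}
```

In the filter, query_template would then point to the path of that file instead of using query. As far as I know the two options should not be combined, and size and sort have to live in the template itself, since result_size only applies to the plain query option. Using an absolute path that the Logstash process can read is probably the safest choice.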

@dadoonet thank you so much for your time, your answers have been really helpful!
