How to index a percolator field via Logstash and the json_encode filter?

I'm trying to index documents and populate a percolator field via Logstash:

Index definition

PUT test_percolate
{
  "mappings": {
    "_doc": {
      "properties": {
        "search_name": {
          "type": "text"
        },
        "query": {
          "type": "percolator"
        }
      }
    }
  }
}

Index via Kibana

PUT test_percolate/_doc/1
{
  "search_name": "co ag",
  "query_to_percolate": {
    "match": {
      "search_name": {
        "query": "co ag",
        "operator": "and"
      }
    }
  }
}
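
For context, a query stored this way can later be matched with a percolate search, e.g. (a minimal sketch against the mapping above):

GET test_percolate/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document": {
        "search_name": "co ag"
      }
    }
  }
}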

Indexing this document works fine via Kibana, and it also works via the PHP Elastic client.
However, I would like to get it to work via Logstash.

My current logstash config is:

input {
  ...
  jdbc {
    statement => "SELECT
        search_name
        FROM some_table
        "
  }
}
filter {
  json_encode {
    source => "search_name"
    add_field => {
      "query" => {
        "match" => {
          "search_name" => {
            "query" => "%{search_name}"
            "operator" => "and"
          }
        }
      }
    }
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
  "hosts" => "<host_name>"
  "index" => "test_percolate"
  "document_type" => "_doc"
  }
}

I've tried various notations, but I'm not getting anywhere with it and keep getting the following error:

[WARN ] 2021-07-05 17:52:17.013 [[main]>worker1] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"test_percolate", :routing=>nil, :_type=>"_doc"}, {"search_name"=>"\"ag & co.\"", "@version"=>"1", "@timestamp"=>2021-07-05T15:52:16.578Z, "query"=>"[\"match\", {\"search_name\"=>{\"query\"=>\"\"ag & co.\"\", \"operator\"=>\"and\"}}]"}], :response=>{"index"=>{"_index"=>"test_percolate", "_type"=>"_doc", "_id"=>"oOJed3oB_2KydUX5w3xK", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse", "caused_by"=>{"type"=>"parsing_exception", "reason"=>"[_na] query malformed, must start with start_object", "line"=>1, "col"=>180}}}}}
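
In other words, the query field arrives at Elasticsearch as one flat string rather than as a JSON object. Schematically (values abbreviated from the log above):

# what was sent (query is a string):
{"search_name": "ag & co.", "query": "[\"match\", {...}]"}

# what the percolator field expects (query is an object):
{"search_name": "ag & co.", "query": {"match": {"search_name": {"query": "ag & co.", "operator": "and"}}}}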

Does anyone have any pointers on how to index into a percolator field via Logstash?
Thanks!

After much trial & error, I finally got it working with the following config.

input {
  ...
  jdbc {
    statement => "SELECT
        search_name
        FROM some_table
        "
  }
}
filter {
  json_encode {
    source => "search_name"
    target => "escaped_search_name"
  }
  mutate {
    add_field => {
      "[query]" => '{
        "match" => {
          "search_name" => {
            "query" => %{escaped_search_name}
            "operator" => "and"
          }
        }
      }'
    }
  }
  json {
    source => "query"
    target => "query"
  }
  mutate {
    remove_field => [ "escaped_search_name" ]
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
  "hosts" => "<host_name>"
  "index" => "test_percolate"
  "document_type" => "_doc"
  }
}

It turns out I had the json filter and the json_encode filter backwards the whole time. To pass on structured JSON, the json filter must be used: json_encode serializes a field into a JSON string, while json parses a JSON string into a structured object.
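
In short, the two filters are inverses of each other (a minimal contrast, with hypothetical field names):

filter {
  # json_encode: structured field -> JSON string
  json_encode {
    source => "some_object"
    target => "some_string"
  }
  # json: JSON string -> structured field
  json {
    source => "some_string"
    target => "some_object"
  }
}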

The filter parts explained (a worked example follows the list):

  1. json_encode: escapes and encodes the raw input so it can safely be embedded in the JSON string built in the next step
  2. mutate add_field: creates a new field [query] whose value is a JSON string ( '{...}' ). It is important to reference the field as [field_name], otherwise the next step, which parses the string into a JSON object, doesn't work
  3. json: parses the string representation into a JSON object
  4. mutate remove_field: gets rid of the escaped helper field, if it shouldn't be indexed
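
A worked example of the intermediate values, assuming the jdbc input yields a search_name of ag co (field names as in the config above):

# after jdbc:
search_name = ag co

# after json_encode (a JSON-quoted string):
escaped_search_name = "ag co"

# after mutate/add_field (query is still a single string):
query = '{"match": {"search_name": {"query": "ag co", "operator": "and"}}}'

# after the json filter (query is now a nested object); the event sent to Elasticsearch:
{"search_name": "ag co", "query": {"match": {"search_name": {"query": "ag co", "operator": "and"}}}}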
