Add new field to index using Logstash getting null values

Hi, I would like to add a new field, gain, to an existing index, with its value based on another field. My pipeline is as follows:

input {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_index"
  }
}

filter {
  if [gain_factor] > "1.0" {
    mutate {
      add_field => [ "gain", "exist" ]
    }
  }
  else {
    mutate {
      add_field => [ "gain", "no-exist" ]
    }
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_index"
  }
  stdout {}
}

When I run the following query:

POST _xpack/sql?format=txt
{
"query" : "SELECT gain_factor, gain FROM my_index"
}

I find that the gain field is null in a lot of documents. My goal is just to add a new field to my index: set gain to "exist" when gain_factor > 1.0, and set it to "no-exist" when gain_factor < 1.0. How can I do that?

Take a look at the docinfo option on the elasticsearch input. You will need that to preserve the document_id so that you can overwrite an existing document.

The handling of document_type is probably going to confuse you. It is in the process of being retired, so the documentation says you need it (and you may) but if you use it then the software warns you not to use it.

Start off by setting 'docinfo => true' on the input and setting

document_id => "%{[@metadata][_id]}"

on the elasticsearch output.
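Putting the two settings together, the pipeline might look like the sketch below. It reuses the hosts, index name, and filter from the question. It assumes the plugin stores the document metadata under [@metadata], as the document_id line above implies; some plugin versions may place it under a different path (controlled by the docinfo_target option), so check the documentation for your version.

```
input {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_index"
    # Keep each document's _index and _id in [@metadata]
    docinfo => true
  }
}

filter {
  # Same condition as in the question
  if [gain_factor] > "1.0" {
    mutate {
      add_field => [ "gain", "exist" ]
    }
  }
  else {
    mutate {
      add_field => [ "gain", "no-exist" ]
    }
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_index"
    # Reuse the original _id so the existing document is overwritten
    # instead of a second copy being indexed
    document_id => "%{[@metadata][_id]}"
  }
  stdout {}
}
```

Without document_id, every event is indexed as a new document with a generated id, so the original documents stay in the index without a gain field. That is why the SQL query returned nulls.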

Thanks! It works perfectly!

When Logstash finishes processing the index, it stops and says:

[INFO ] 2019-02-07 16:25:47.671 [[main]-pipeline-manager] pipeline - Pipeline has terminated {:pipeline_id=>"main", :thread=>"#<Thread:0x4ea6dc5 run>"}

Is it possible to keep the pipeline running in real time, so that when the index gets new data, Logstash processes it?

Run it using a schedule and update your query so it only fetches documents that do not have a gain field.

How can I do that? And is this the schedule to make it run in real time?

schedule => "* * * * *"

Thanks for your help.

That schedule would run the query once a minute, which is the finest granularity a five-field cron expression offers. I have never run anything that often, so I do not know if it creates any issues.

As to what the query should be, that's an Elasticsearch question, but you pass it to Logstash using the query option on the elasticsearch input.
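As a rough sketch of how the schedule and query options could be combined on the input: the query below uses a bool/must_not/exists clause to fetch only documents that have no gain field yet. The query itself is an assumption about what fits this index, not something tested against it, and the filter and output sections are assumed to stay as they are.

```
input {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "my_index"
    docinfo => true
    # Cron-style schedule: run the query every minute
    schedule => "* * * * *"
    # Assumed query: only documents where the gain field is missing
    query => '{ "query": { "bool": { "must_not": { "exists": { "field": "gain" } } } } }'
  }
}
```

With a schedule set, the pipeline no longer terminates after the first pass. Because each run only picks up documents still missing gain, documents that were already updated are not reprocessed.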
