Importing to an existing index in Elastic doesn't behave as expected

I am exporting from one ELK stack using Logstash, writing the output to a gzipped JSON file.
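
For context, the export side is roughly the following sketch (simplified: the [meta] docinfo_target is just an illustrative way to carry each document's original _index and _id into the file, and the index name is one example):

input {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "metricbeat-7.17.7-2023.04.28-000007"
    # copy _index/_id into a regular field so they survive
    # serialization into the file ([meta] is an illustrative name)
    docinfo => true
    docinfo_target => "[meta]"
  }
}

output {
  file {
    path => "/usr/share/logstash/export/export_metricbeat-7.17.7-2023.04.28-000007.json"
    codec => "json_lines"
    # compress the exported file on the way out
    gzip => true
  }
}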

Then I import it into another ELK stack, also using Logstash. There is no network connection between the two environments.

This export/import can occur multiple times throughout the day.

I have noticed that when the target index already exists in the second environment, the import seems to just append documents to the end of the index, and only a fraction of the data arrives. So I might get 200k documents imported instead of the expected 5 million.

This is my import pipeline for the Metricbeat index:

input {
  file {
    path => "/usr/share/logstash/export/export_metricbeat-7.17.7-2023.04.28-000007.json"
    start_position => "beginning"
    codec => "json"
    # read the file once from the beginning, then exit
    mode => "read"
    exit_after_read => true
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "metricbeat-7.17.7-2023.04.28-000007"
    # ssl expects a boolean, not a string
    ssl => false
  }
}

If I have already done an import on the 28th of April, then this index will already exist in the target environment.

Is there a way in Logstash to always import everything, no matter what already exists in Elasticsearch? I thought it might be duplicate document IDs, but I don't see how that's possible; the document IDs should be unique.
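
If it does turn out to be ID-related, I assume the import output could pin the document ID explicitly, something like this sketch (assuming the export carried the original _id in a [meta] field, as in the export sketch above, so a re-import overwrites existing documents instead of duplicating them):

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "metricbeat-7.17.7-2023.04.28-000007"
    # reuse the original document _id from the export so a
    # re-import updates existing documents rather than adding new ones
    document_id => "%{[meta][_id]}"
    ssl => false
  }
}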
