I am exporting from one ELK stack using Logstash and writing the output to a gzipped JSON file. Then I import that file into another ELK stack, also using Logstash. There is no connection between the two environments.
This export/import can occur multiple times throughout the day.
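For context, the export side looks roughly like this (simplified; the host and index name are stand-ins for my actual values):

input {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "metricbeat-7.17.7-2023.04.28-000007"
    # no docinfo here, so only _source is exported (the original _id is not preserved)
  }
}
output {
  file {
    path => "/usr/share/logstash/export/export_metricbeat-7.17.7-2023.04.28-000007.json.gz"
    gzip => true
    codec => json_lines
  }
}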
I have noticed that when the target index already exists in the 2nd environment, the import seems to just append documents onto the end of the index, so I might end up with, say, 200k documents imported instead of the roughly 5 million I expect.
This is my import pipeline for the metricbeat index:
input {
  file {
    path => "/usr/share/logstash/export/export_metricbeat-7.17.7-2023.04.28-000007.json"
    start_position => "beginning"
    codec => "json"
    mode => "read"
    exit_after_read => true
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "metricbeat-7.17.7-2023.04.28-000007"
    ssl => false
  }
}
If I have already done an import on the 28th of April, this index will already exist in the target environment.
Is there a way in Logstash to always import everything, no matter what already exists in Elasticsearch? I thought it might be duplicate document IDs, but I don't see how that's possible; the document IDs should be unique.
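The only ID-related setting I know of on the output side is document_id. As a minimal sketch of what I mean, assuming each exported document carried its original ID in a hypothetical doc_id field (mine currently don't), the output would look like:

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "metricbeat-7.17.7-2023.04.28-000007"
    # doc_id is a hypothetical field holding the original _id; with an explicit
    # document_id, re-importing the same file would overwrite rather than duplicate
    document_id => "%{doc_id}"
    ssl => false
  }
}

Without an explicit document_id, though, Elasticsearch generates a fresh _id for every indexed document, which is why I don't see how duplicate IDs could be the problem.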