How to improve the performance of re-indexing from Logstash?

Hi There,

We have a 460 GB index that needs to be re-indexed to make all fields not_analyzed.

We started re-indexing around 8 days ago and it is still in progress (though there have been environment issues: a shards-unavailable exception).

Surprisingly, I can see the new index size reach 330 GB and then, a few minutes later, drop back to 310 GB. It has been going on like this for the last 5 days, with no increase in index size or document count.

Are we missing any configuration here? Please help.

We have:
refresh_interval: -1
replicas: 0
indices.memory.index_buffer_size: 25% (25% of 56 GB)
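(For context: the first two are dynamic index settings. A minimal sketch of applying them, assuming the target index it_customevent from the Logstash output below. indices.memory.index_buffer_size is a node-level setting, so it lives in elasticsearch.yml and needs a node restart.)

PUT it_customevent/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}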

Our Logstash configuration is as below:

input {
  elasticsearch {
    hosts => ["10.158.36.199"]
    index => "customevent"
    size => 5000        # scroll batch size
    scroll => "20m"     # keep the scroll context alive for 20 minutes
    docinfo => true     # expose _index, _type and _id under [@metadata]
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => ["10.158.36.199"]
    codec => json
    index => "it_customevent"
    document_type => "dailyaggregate"
    document_id => "%{[@metadata][_id]}"   # preserve the original document IDs
  }
}

I don't see anything obviously wrong. Is there anything in the log files? The size may change if your mapping has changed; what is the document count?

FYI, this will never "end", because Logstash is supposed to run continuously, so it will keep running for the next year if you leave it.

You may want to try just exporting and re-importing the data with https://github.com/taskrabbit/elasticsearch-dump if this is a one-time mapping change. There are many other tools; Knapsack is another.
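For example, a rough sketch with elasticdump, assuming it is installed (npm install -g elasticdump), the cluster is reachable on port 9200, and the index names from this thread; you would create the target index with the new not_analyzed mapping before copying:

# copy the documents over in batches of 5000
elasticdump \
  --input=http://10.158.36.199:9200/customevent \
  --output=http://10.158.36.199:9200/it_customevent \
  --type=data \
  --limit=5000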


Thanks Ed for suggesting the tools for re-indexing.

Actually, we decided to use the Reindex API instead, re-indexing one day at a time, and this helped us finish the re-index task in 1-1.5 days (around 600 GB).
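For anyone finding this later, a minimal sketch of that approach, assuming the documents carry a date field named @timestamp (the field name and dates here are placeholders); run one request like this per day:

POST _reindex
{
  "source": {
    "index": "customevent",
    "query": {
      "range": {
        "@timestamp": {
          "gte": "2016-01-01",
          "lt": "2016-01-02"
        }
      }
    }
  },
  "dest": {
    "index": "it_customevent"
  }
}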

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.