Optimal merge policy for indexing log data

Jan_Fiedler · January 22, 2013, 5:16pm

I'm working on a small project that indexes log data (using one index per
day). For now I have a single ES node for testing, indexing at a rather
low volume (up to 1 million messages/day). While monitoring the CPU
consumption of that node, I found that it increased slowly but
significantly over the course of the 'indexing day'. I assume this is
because the index grows, making certain operations more expensive over
time. I used the hot threads API and found that index merge operations
seem to be consuming that CPU, which brings me to the actual question:

What would be the recommended merge policy for this kind of log data set?
It's different from normal content in that it never changes (i.e. no
updates) and never gets deleted (only a whole index gets deleted). New
documents are only ever added, at a fairly constant rate. I would guess
there are merge settings suited to this type of traffic.
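
For reference, a rough sketch of the setup described above. The index
naming scheme (`logs-YYYY.MM.DD`), the node address and the helper names
are assumptions made for illustration, not part of the actual project;
`/_nodes/hot_threads` is the hot threads endpoint used for the diagnosis.

```python
from datetime import date
from urllib.request import urlopen

# Assumed address of the single test node.
ES_URL = "http://localhost:9200"


def daily_index(day: date, prefix: str = "logs") -> str:
    """Name of the per-day index, e.g. logs-2013.01.22 (hypothetical scheme)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def hot_threads_url(base: str = ES_URL, threads: int = 3) -> str:
    """URL of the hot threads API, reporting the `threads` busiest threads per node."""
    return f"{base}/_nodes/hot_threads?threads={threads}"


def fetch_hot_threads(base: str = ES_URL) -> str:
    """Return the plain-text hot threads report (requires a running node)."""
    with urlopen(hot_threads_url(base)) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_hot_threads()` periodically during the day and looking for
merge-related stack frames in the report is roughly how the CPU usage was
traced to merging.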
