Optimum index.translog.flush_threshold_ops setting

I'm not sure how to work out the optimum
"index.translog.flush_threshold_ops" for my Logstash setup.
Logstash indexes are created on a daily basis. I'd like some advice. I
hope these details help:

i. ES version- 0.90.0 stable
ii. ES index template
curl -XPUT -d '{
  "template" : "logstash*",
  "settings" : {
    "number_of_shards" : 1,
    "index.cache.field.type" : "soft",
    "index.refresh_interval" : "10s",
    "index.store.compress.stored" : true
  }
}'
iii. Logstash pushes approx. 2800 logs/sec into ES.
iv. ES is tuned with
bootstrap.mlockall: true

Do I need to change any of the other parameters mentioned at
http://www.elasticsearch.org/guide/reference/index-modules/translog/
from their defaults?


You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


I think tuning the flush threshold with 2.8k logs/s is like tuning your
bulk size: increase it, then see if you get a significant performance gain.
If you do, increase it again until you don't get a significant gain.
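
To make that loop concrete, here is a rough sketch of the "increase until the gain flattens" procedure. Everything in it is illustrative: tune_threshold, fake_measure, the doubling factor and the 5% cut-off are my own assumptions, and fake_measure is only a stand-in for a real benchmark that would apply the setting and measure indexing throughput on your cluster.

```python
from typing import Callable


def tune_threshold(
    measure: Callable[[int], float],
    start: int = 5000,
    factor: int = 2,
    min_gain: float = 0.05,
    max_value: int = 1_000_000,
) -> int:
    """Grow the setting until the relative throughput gain drops below min_gain.

    measure(threshold) is a hypothetical callback that returns the indexing
    rate (docs/s) observed with that threshold applied.
    """
    best = start
    best_rate = measure(best)
    while best * factor <= max_value:
        candidate = best * factor
        rate = measure(candidate)
        # Stop as soon as the improvement is no longer significant.
        if rate < best_rate * (1 + min_gain):
            break
        best, best_rate = candidate, rate
    return best


def fake_measure(threshold: int) -> float:
    """Made-up throughput curve that stops improving at 40000 ops."""
    return min(threshold, 40000) / 10.0


chosen = tune_threshold(fake_measure)
print(chosen)
```

With a real measurement in place of fake_measure, the same loop works for bulk size or any other "bigger is better up to a point" setting.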

The default setting of 5000 operations means you'll have a flush roughly
every 1.8 seconds at 2800 logs/s (5000 / 2800). I'd make it 10 times that,
so 50000. Assuming your logs are fairly small (under 1 KB or so), that
shouldn't make for a huge transaction log.
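
The numbers behind that estimate can be checked with quick arithmetic. This is only a back-of-the-envelope sketch: it assumes a steady 2800 operations per second, exactly 1 KiB per log, and ignores any per-operation overhead in the translog itself. The function names are made up for illustration.

```python
def flush_interval_seconds(threshold_ops: int, ops_per_second: float) -> float:
    """Seconds between flushes if only the operation count triggers them."""
    return threshold_ops / ops_per_second


def translog_size_mib(threshold_ops: int, bytes_per_op: int) -> float:
    """Approximate translog size (MiB) right before a flush."""
    return threshold_ops * bytes_per_op / 2**20


RATE = 2800          # logs/s, from the original post
LOG_BYTES = 1024     # assumption: about 1 KiB per log

default_interval = round(flush_interval_seconds(5000, RATE), 1)
raised_interval = round(flush_interval_seconds(50000, RATE), 1)
raised_size = round(translog_size_mib(50000, LOG_BYTES), 1)

print(default_interval)  # 1.8
print(raised_interval)   # 17.9
print(raised_size)       # 48.8
```

So ten times the default gives a flush about every 18 seconds and a translog of roughly 50 MB at most, under those assumptions.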

Some other advice regarding your template:

  • in my experience, having 1 shard per index doesn't help you significantly
    in terms of search performance. Having 5 shards instead of 1, however,
    should help boost indexing performance, and also gives you room to
    add more nodes to host the same index
  • soft field caches will put pressure on your CPU because of garbage
    collection, and CPU is a precious resource when indexing. Since you're on
    0.90, I suggest you cap the cache size instead by
    setting index.fielddata.cache.size to something like 20%, or whatever fits
    your needs in terms of search performance vs. memory used
  • increasing indices.memory.index_buffer_size from the default 10% might
    help your indexing speed. As with the translog, I think it's a matter of
    trial-and-error to get the right size
  • in 0.90, compression of stored fields is always enabled at the Lucene
    level, so "index.store.compress.stored" : true shouldn't have any effect
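
Putting those suggestions together, the template body might end up looking something like the sketch below. Treat it as a starting point rather than a tested config: the values (5 shards, 50000 ops, 20%) are just the ones suggested in this thread, and I haven't verified that every setting is accepted in a template on 0.90.0. As far as I know, indices.memory.index_buffer_size is a node-level setting, so it is left out here.

```python
import json

# Revised template body based on the advice in this thread.
# Dropped: "index.cache.field.type" (soft) and "index.store.compress.stored".
template = {
    "template": "logstash*",
    "settings": {
        "number_of_shards": 5,
        "index.refresh_interval": "10s",
        # Ten times the default of 5000; tune by trial and error.
        "index.translog.flush_threshold_ops": 50000,
        # Cap on field data, as suggested above; adjust to your searches.
        "index.fielddata.cache.size": "20%",
    },
}

body = json.dumps(template, indent=2)
print(body)
```

The printed JSON can be used as the -d payload of the same curl -XPUT call that created the original template.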

Best regards,

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Sat, May 4, 2013 at 8:03 AM, Subin ksubins321@gmail.com wrote:

Do I need to change any of the other parameters mentioned at
http://www.elasticsearch.org/guide/reference/index-modules/translog/
from their defaults?

