Understanding Index Buffer Size and Its Effects

Hey Guys,

I am tuning my Elasticsearch cluster, which consists of 3 dedicated servers:

  • 2 octa-core processors (16 cores), with HT (32 threads)
  • 32 GB RAM (DDR3)

We are running the latest beta2. We have allocated 16 GB to Elasticsearch.

We have only 1 index with 32 shards and 1 replica each, making it 64 shards
in the cluster.
We have a lot of update requests. In fact, all our writes are updates with
upserts. We are using routing to put similar data in the same shard. We see
the rate of updates ranging between 20/sec and 60/sec. This is going to
increase to about 130-140/sec when we go live. Our search queries are mostly
filtered queries with heavy use of terms faceting.

I have allocated 40% of the heap to index_buffer_size (an arbitrary choice),
which comes out to around 6.4 GB. Average document size is 900 bytes.

I have a feeling that the allocated buffer size is going to waste, as our
indexing/updating/upserting rates are not so high. If my reasoning is
correct, with 60 upserts/sec of docs with an average size of 1 KB, an
index_buffer_size of 60 x 1 KB x 32 (shards) x 3 (keeping room for 3 times
the traffic) = 5760 KB (approx. 5.6 MB) should be more than enough. And by
allocating 40% of the JVM heap, I am wasting heap which I could probably use
for Filter Cache and Field Data Cache.
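
For reference, the arithmetic above can be checked with a short Python
sketch. It only reproduces my own back-of-the-envelope formula and the 40%
of 16 GB figure; it is not how Elasticsearch itself sizes the buffer, and the
function and variable names are made up for illustration.

```python
# Back-of-the-envelope check of the figures in this post.
# Assumption: this mirrors the poster's formula, not Elasticsearch internals.
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3


def estimated_buffer_bytes(upserts_per_sec, doc_size_bytes, shards, headroom):
    """Upserts/sec x doc size x shards x traffic headroom factor."""
    return upserts_per_sec * doc_size_bytes * shards * headroom


# 60 upserts/sec, 1 KB docs, 32 shards, room for 3x the traffic.
estimate = estimated_buffer_bytes(60, 1 * KB, 32, 3)

# 40% of the 16 GB heap given to Elasticsearch.
configured = 0.40 * 16 * GB

estimate_kb = estimate / KB
estimate_mb = estimate / MB
configured_gb = configured / GB

print(f"estimated need: {estimate_kb:.0f} KB ({estimate_mb:.3f} MB)")
print(f"configured cap: {configured_gb:.1f} GB")
print(f"cap / estimate: {configured / estimate:.0f}x")
```

By this estimate, the configured limit is roughly three orders of magnitude
larger than what the formula says is needed.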

Now I have the following questions:

  • Are my calculations correct?
  • Is there something else that I need to consider?
  • If the index_buffer_size is underutilized, will Elasticsearch be able
    to use a portion of it for something else when required?

Looking forward to some valuable feedback. Will be happy to share more
details if required.

Thanks,
Vaidik Kapoor
vaidikkapoor.info

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CACWtv5nGDrmFdEphRYjtQynMgrkfAePGMTjydSwzn4SYNVYZPA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

You cannot waste heap. The percentage allocations are dynamic upper
bounds, by which you tell ES: "when you are about to dynamically re-allocate
buffers, check whether you have exceeded this limit".

To watch ES performing dynamic buffer re-allocations, enable DEBUG level in
the logs.

Jörg
