Hello,
We are not getting the desired performance out of our Elasticsearch
cluster. Here is the setup:
- 4 nodes
- each node is an EC2 m1.large, 2 cpu, 7.5 gb memory
- data on EBS volumes
- 1 index, 2,956,699 documents, 30 shards, 0 replicas
- HAProxy round robin to each node, 1 second connection timeout, 5
second response timeout
- A document consists of 15 to 50 fields
- Fields are mostly not analyzed strings, longs, floats and a few dates.
We have no analyzed strings at all.
- ES_HEAP_SIZE = 5gb (each node has total of 7.5 gb)
- bootstrap.mlockall: true
- indices.memory.index_buffer_size: 50%
- index.refresh_interval: 30
- index.translog.flush_threshold_ops: 50000
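For reference, here are the per-node figures implied by the setup above. This is only arithmetic on the numbers listed, nothing measured, and it assumes that shards are spread evenly across the nodes and that the index buffer percentage is taken from the JVM heap.

```python
# Figures copied from the setup list above.
nodes = 4
shards = 30
documents = 2956699
heap_gb = 5.0
ram_gb = 7.5
index_buffer_fraction = 0.50

# Assumes shards are balanced evenly across the nodes.
shards_per_node = shards / nodes
docs_per_shard = documents / shards

# Memory left for the OS and its filesystem cache once the heap is locked.
non_heap_gb = ram_gb - heap_gb

# Assumes index_buffer_size is a percentage of the JVM heap.
index_buffer_gb = heap_gb * index_buffer_fraction

print("shards per node: %.1f" % shards_per_node)
print("documents per shard: %.0f" % docs_per_shard)
print("memory outside the heap: %.1f GB per node" % non_heap_gb)
print("index buffer: %.1f GB per node" % index_buffer_gb)
```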
The profile of our work is many small jobs of:
- Retrieve document (not search)
- Change a field
- Index document
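To make one job concrete, here is a minimal sketch of the three steps above over the REST API. The host, index name, type name and the helper names (http_json, update_field) are placeholders for illustration, not our real values, and the /index/type/id URL shape assumes an Elasticsearch version that still uses mapping types.

```python
import json
import urllib.request

# Placeholder values; our real host, index and type names differ.
BASE_URL = "http://localhost:9200"
INDEX = "myindex"
DOC_TYPE = "mytype"


def http_json(method, url, body=None):
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def update_field(doc_id, field, value, send=http_json):
    """Get a document by id, change one field, and index it again."""
    url = "%s/%s/%s/%s" % (BASE_URL, INDEX, DOC_TYPE, doc_id)
    doc = send("GET", url)["_source"]  # retrieve by id (not a search)
    doc[field] = value                 # change a field
    return send("PUT", url, doc)       # index the whole document again
```

A call such as update_field("42", "status", "done") is one job. Each job is therefore two HTTP round trips (one GET, one PUT), assuming both go through HAProxy.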
When HAProxy reports a session rate of about 100 (which is to be read as
100 requests/sec, I think), we start getting connection timeouts and
response timeouts.
100 index operations/sec seems pretty low, even for our very modest hardware.
We keep trying new things (adding more nodes, tuning, etc) and our
next experiment is to take the data off the EBS volumes. It takes a lot of
time and effort to try these experiments, so I was hoping to post our setup
here and maybe get a push in the right direction, rather than stumbling
around blindly.
Thanks for the help.