Performance issues on EC2/EBS

Christopher_J_Bottar · May 21, 2013, 7:43pm

Hello,

We are not getting the desired performance out of our Elasticsearch
cluster. Here is the setup:

4 nodes
each node is an EC2 m1.large, 2 cpu, 7.5 gb memory
data on EBS volumes
1 index, 2,956,699 documents, 30 shards, 0 replicas
HAProxy round robin to each node, 1 second connection timeout, 5
second response timeout
A document consists of 15 to 50 fields
Fields are mostly not analyzed strings, longs, floats and a few dates.
We have no analyzed strings at all.
ES_HEAP_SIZE = 5gb (each node has total of 7.5 gb)
bootstrap.mlockall: true
indices.memory.index_buffer_size: 50%
index.refresh_interval: 30
index.translog.flush_threshold_ops: 50000

The profile of our work is many small jobs of:

Retrieve document (not search)
Change a field
Index document
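
For illustration only (this is not the poster's code), one such job might look like the sketch below. The `/{index}/{type}/{id}` path layout, the `_source` key in the GET response, and the `send` transport callable are assumptions made for the example; `update_field` is a hypothetical helper name. Each job costs two HTTP round trips, so a given job rate means roughly twice that many requests through the proxy.

```python
import json
from typing import Callable, Optional, Tuple

# Assumed transport: any callable (method, path, body) -> (status, response_bytes).
Transport = Callable[[str, str, Optional[bytes]], Tuple[int, bytes]]


def update_field(send: Transport, index: str, doc_type: str, doc_id: str,
                 field: str, value) -> dict:
    """Retrieve a document by id, change one field, and index it again."""
    path = f"/{index}/{doc_type}/{doc_id}"

    # Step 1: retrieve the document by id (a GET, not a search).
    status, raw = send("GET", path, None)
    if status != 200:
        raise LookupError(f"GET {path} returned {status}")
    source = json.loads(raw)["_source"]

    # Step 2: change a field.
    source[field] = value

    # Step 3: index the whole document again under the same id.
    status, raw = send("PUT", path, json.dumps(source).encode("utf-8"))
    if status not in (200, 201):
        raise RuntimeError(f"PUT {path} returned {status}")
    return source
```

Passing the transport in as a callable keeps the job logic separate from connection handling, so the same function can be exercised on a laptop against a stub before pointing it at the cluster.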

When HAProxy reports a session rate of about 100 (which is to be read as
100 requests/sec, I think), we start getting connection timeouts and
response timeouts.

100 index operations/sec seems pretty low, even for our very modest hardware.

We keep trying new things (adding more nodes, tuning, etc) and our
next experiment is to take the data off the EBS volumes. It takes a lot of
time and effort to try these experiments, so I was hoping to post our setup
here and maybe get a push in the right direction, rather than stumbling
around blindly.

Thanks for the help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

drewr · May 22, 2013, 4:25pm

Christopher J. Bottaro wrote:

We are not getting the desired performance out of our Elasticsearch
cluster. Here is the setup:

4 nodes
each node is an EC2 m1.large, 2 cpu, 7.5 gb memory
data on EBS volumes
1 index, 2,956,699 documents, 30 shards, 0 replicas
HAProxy round robin to each node, 1 second connection timeout, 5
second response timeout

[...]

index.refresh_interval: 30

You probably mean 30s. Up this to minutes or even disable it (-1)
if you can afford the delay in docs showing up in search.
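
A minimal sketch of changing this setting at runtime, assuming the index settings endpoint `PUT /{index}/_settings`. The helper name `refresh_interval_request` and the index name `jobs` in the usage note are made up for the example. The helper only builds the request and insists on an explicit unit (or -1), so the value cannot be misread.

```python
import json
import re

# Accept "-1" (disable refresh) or a number with an explicit unit, e.g. "30s", "5m".
_INTERVAL = re.compile(r"^(-1|\d+(ms|s|m|h))$")


def refresh_interval_request(index: str, interval: str):
    """Return (method, path, body) for an update-index-settings call."""
    if not _INTERVAL.match(interval):
        raise ValueError(
            f"refresh interval needs an explicit unit or -1, got {interval!r}"
        )
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return "PUT", f"/{index}/_settings", body
```

For example, `refresh_interval_request("jobs", "5m")` builds the request for a five-minute refresh, and passing "-1" builds the one that disables periodic refresh.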

index.translog.flush_threshold_ops: 50000

Don't set this; the default should be fine.

When HAProxy reports a session rate of about 100 (which is to be read
as 100 requests/sec, I think), we start getting connection timeouts
and response timeouts.

This is the rate of new connections through the proxy. You want
this number to be as low as possible, around a handful per data
node. Make sure your HTTP client is using keep-alives.
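
As a rough sketch of what keep-alive use looks like on the client side, the snippet below sends every request over one persistent connection using Python's standard `http.client`. The class name `KeepAliveClient` is hypothetical, and whatever client library is actually in use may expose this differently; the point is only that the connection object is created once and reused.

```python
import http.client
import json


class KeepAliveClient:
    """Sends many requests over one TCP connection (HTTP/1.1 keep-alive)."""

    def __init__(self, host: str, port: int, timeout: float = 5.0):
        # The socket is opened lazily on the first request and then reused.
        self.conn = http.client.HTTPConnection(host, port, timeout=timeout)

    def request(self, method: str, path: str, body=None):
        payload = None if body is None else json.dumps(body).encode("utf-8")
        headers = {"Content-Type": "application/json"} if payload else {}
        self.conn.request(method, path, body=payload, headers=headers)
        resp = self.conn.getresponse()
        # The response must be read fully before the connection can be reused.
        data = resp.read()
        return resp.status, data

    def close(self):
        self.conn.close()
```

A client that instead opens a fresh connection per request shows up in HAProxy as one new session per request, which would match a session rate that tracks the request rate.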

EBS likely isn't your bottleneck here. You probably have some
client issues you could work out on a laptop first and avoid the
overhead of the deployment loop.

Drew

