Scaling Writes/Indexing


(Christopher Jones) #1

Dear Folks:

I am running some scaling tests on elastic search. We are considering using
elastic search to store a large volume of log file data. A key metric for
us is the write throughput. We've been trying to understand how
Elasticsearch can scale. We were hoping to see some form of linear scaling.
That is, add in new nodes, increase the shard count, etc, in order to be
able to handle an increased number of writes. No matter what lever we pull,
we're not seeing an increase in write throughput. Clearly we're missing
something.

We have tested several different clusters on AWS (details below). We have
doubled the number of nodes; we have doubled (and doubled again) the number
of shards. We have doubled the number of servers feeding the cluster log
file data. We have put the nodes in the cluster behind an AWS load
balancer. We have increased the number of open files allowed by ubuntu.
None of these changes has had any effect on the number of records indexed
per second.

Details:

We are piping json data using fluentd, using the fluentd elastic search
plugin, storing the log file entries using the logstash format, which is
supported by the fluentd elastic search plugin. Note that we are not using
logstash as a data pipeline.

We configured the cluster with only 1 replica, the default.

We increased the ulimit for open files to 32000

Elastic search configuration:

discovery.type: ec2

cloud:
aws:

    region: us-west-2
    groups: elastic-search

cloud.node.auto_attributes: true

index.number_of_shards: 40 # we varied this between 10-40
index.refresh_interval: 60s

bootstrap.mlockall: true

discovery.zen.ping.multicast.enabled: false

Here's how we setup the memory configuration in setup.sh

#!/bin/sh
export ES_MIN_MEM=16g
export ES_MAX_MEM=30g
bin/elasticsearch -f

Each node in the cluster is m1.xlarge, which has the following
characteristics:

Instance Family Instance Type Processor Arch vCPU ECU Memory
(GiB)
Instance Storage (GB) EBS-optimized Available Network
Performance
General purpose m1.xlarge 64-bit 4 8 15 4 x 420 Yes High

We've monitored the cluster using http://www.elastichq.org. For each index,
we calculated (Indexing Total)/(Indexing Time) to get the number of records
indexed per second. Whether we have a single node with 10 shards or 6 nodes
with 40 shards, we consistently see indexing occurring at a rate of around
3000-3500 records/second. It is this measure that we never seem to be able
to increase.

Here's some characteristic stats for an individual node:

Heap Used: 1.1gb
Heap Committed: 7.9gb
Non Heap Used: 42.3mb
Non Heap Committed: 66.2mb
JVM Uptime: 4h
Thread Count/Peak: 62 / 82
GC Count: 8156
GC Time: 4.8m
Java Version: 1.7.0_45
JVM Vendor: Oracle Corporation
JVM: Java HotSpot(TM) 64-Bit Server VM

We have not seen memory contention, or high CPU utilization, in general
(CPU utilization is around 80% out of 400% possible). We're not doing any
reading of the database (that would be a subsequent test).

Here's some more stats:
Open File Descriptors:795CPU Usage:68% of 400%CPU System:1.5mCPU User:26.6mCPU
Total:28.2mResident Memory:8.4gbShared Memory:19.6mbTotal Virtual Memory:
10.3gb

Thanks for any help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7cbf8e9d-6e14-4154-8f95-aa28e1dc5934%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

From what I understand in the fluentd plugin code

only a single HTTP connection to a single node is opened in a single
threaded manner. This does not scale from the point of view of document
generating. If you want to increase throughput in the push process, you
should multiply your feeds, or throughput will stay the same.

As a side note, the fluentd plugin code does not examine bulk responses for
errors returned from the cluster, which makes the plugin error-prone.

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH2zBpXg2Un3F21n1cHYUy9K%3D5qnGCa6v-94i7edZUO%3DA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Christopher Jones) #3

We have setup multiple senders of data into elastic search, each sender
running on a separate box. So, that doubled the number of writers into
elastic search. Doubling the number of writers did not affect write
throughput. I'll double the number of writers a second time (from 2 to 4,
to see if that produces a change).

On Wednesday, December 4, 2013 10:22:19 AM UTC-8, Jörg Prante wrote:

From what I understand in the fluentd plugin code

https://github.com/uken/fluent-plugin-elasticsearch/blob/master/lib/fluent/plugin/out_elasticsearch.rb

only a single HTTP connection to a single node is opened in a single
threaded manner. This does not scale from the point of view of document
generating. If you want to increase throughput in the push process, you
should multiply your feeds, or throughput will stay the same.

As a side note, the fluentd plugin code does not examine bulk responses
for errors returned from the cluster, which makes the plugin error-prone.

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/90f9c987-bc92-4797-9a5c-3660387d1bbf%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Christopher Jones) #4

One closely related question. What does the index time statistic for an
index mean? It's pretty clearly not wall-clock time. Is it cpu time across
all CPU's in the cluster? Is it something else? If it's cpu time, then the
statistic (number of documents indexed/indexing time) I was calculating to
gauge indexing speed is not the right statistic. This may be the actual
source of why I wasn't seeing linear scaling.

On Wednesday, December 4, 2013 10:22:19 AM UTC-8, Jörg Prante wrote:

From what I understand in the fluentd plugin code

https://github.com/uken/fluent-plugin-elasticsearch/blob/master/lib/fluent/plugin/out_elasticsearch.rb

only a single HTTP connection to a single node is opened in a single
threaded manner. This does not scale from the point of view of document
generating. If you want to increase throughput in the push process, you
should multiply your feeds, or throughput will stay the same.

As a side note, the fluentd plugin code does not examine bulk responses
for errors returned from the cluster, which makes the plugin error-prone.

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/711e9755-3664-414a-971e-7955b2931684%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #5

If you mean localhost:9200/_stats/indexing "index_time_in_millis" output,
these are clock ticks (elapsed time, not wall clock). The clock ticks are
measured that are spent in internal "engine" methods (Lucene execution
time). The stats are dynamic, counting from since the node started,
measured on each shard by System.nanoTime(), converted to milliseconds for
output, and summed up over the shards for an index.

Because a JVM uses the "most precise available system timer" for
System.nanoTime(), these stats are not comparable across different JVMs and
OSes.

For a comparable indexing throughput number, you should take measurements
at the client side, and subtract network latency.

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGoT1Y%3DqfSKovqTxpchbuvh1fwpM-TnM%3DDy4BZf-cTDoQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Mark Walkom) #6

You should set ES MIN and MAX to be the same figure, that figure being 50%
of your total system memory size. This may or may not affect your tests but
it's best practise.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 5 December 2013 04:54, Christopher Jones chrjones17@gmail.com wrote:

Dear Folks:

I am running some scaling tests on elastic search. We are considering
using elastic search to store a large volume of log file data. A key metric
for us is the write throughput. We've been trying to understand how
Elasticsearch can scale. We were hoping to see some form of linear scaling.
That is, add in new nodes, increase the shard count, etc, in order to be
able to handle an increased number of writes. No matter what lever we pull,
we're not seeing an increase in write throughput. Clearly we're missing
something.

We have tested several different clusters on AWS (details below). We have
doubled the number of nodes; we have doubled (and doubled again) the number
of shards. We have doubled the number of servers feeding the cluster log
file data. We have put the nodes in the cluster behind an AWS load
balancer. We have increased the number of open files allowed by ubuntu.
None of these changes has had any effect on the number of records indexed
per second.

Details:

We are piping json data using fluentd, using the fluentd elastic search
plugin, storing the log file entries using the logstash format, which is
supported by the fluentd elastic search plugin. Note that we are not using
logstash as a data pipeline.

We configured the cluster with only 1 replica, the default.

We increased the ulimit for open files to 32000

Elastic search configuration:

discovery.type: ec2

cloud:
aws:

    region: us-west-2
    groups: elastic-search

cloud.node.auto_attributes: true

index.number_of_shards: 40 # we varied this between 10-40
index.refresh_interval: 60s

bootstrap.mlockall: true

discovery.zen.ping.multicast.enabled: false

Here's how we setup the memory configuration in setup.sh

#!/bin/sh
export ES_MIN_MEM=16g
export ES_MAX_MEM=30g
bin/elasticsearch -f

Each node in the cluster is m1.xlarge, which has the following
characteristics:

Instance Family Instance Type Processor Arch vCPU ECU Memory
(GiB)
Instance Storage (GB) EBS-optimized Available Network
Performance
General purpose m1.xlarge 64-bit 4 8 15 4 x 420 Yes High

We've monitored the cluster using http://www.elastichq.org. For each
index, we calculated (Indexing Total)/(Indexing Time) to get the number of
records indexed per second. Whether we have a single node with 10 shards or
6 nodes with 40 shards, we consistently see indexing occurring at a rate of
around 3000-3500 records/second. It is this measure that we never seem to
be able to increase.

Here's some characteristic stats for an individual node:

Heap Used: 1.1gb
Heap Committed: 7.9gb
Non Heap Used: 42.3mb
Non Heap Committed: 66.2mb
JVM Uptime: 4h
Thread Count/Peak: 62 / 82
GC Count: 8156
GC Time: 4.8m
Java Version: 1.7.0_45
JVM Vendor: Oracle Corporation
JVM: Java HotSpot(TM) 64-Bit Server VM

We have not seen memory contention, or high CPU utilization, in general
(CPU utilization is around 80% out of 400% possible). We're not doing any
reading of the database (that would be a subsequent test).

Here's some more stats:
Open File Descriptors: 795 CPU Usage: 68% of 400% CPU System: 1.5m CPU
User:26.6m CPU Total: 28.2m Resident Memory: 8.4gb Shared Memory: 19.6mbTotal Virtual Memory:10.3gb

Thanks for any help.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/7cbf8e9d-6e14-4154-8f95-aa28e1dc5934%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAEM624YRfRRMHipt-wkZx6JCmYUnNYy_BxuY9fuYtYzVmpUksg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Radu Gheorghe) #7

+1 to Mark's comment. If I'm reading your numbers correctly, it looks like
you have 16-30GB to ES, but only 8GB of RAM. And you seem to have 8.4GB
resident memory. Does this mean the machine is swapping? Either way, it
looks like you have too much memory allocated to ES. The rule of thumb is
to leave half of your RAM for the OS to cache your indices, otherwise
you'll hit the disk too often.

The next step, I think, is to look at what's your bottleneck? Is your
cluster idling because the single-threaded client sending too little data?
Maybe you can use more clients and/or bigger bulk sizes. CPU doesn't seem
to be an issue, but maybe I/O is. You may want to look at store throttling
settingshttp://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html,
the index buffer
sizehttp://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-indices.htmland
the transaction
loghttp://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-translog.htmlsettings.

If you have some time, check out my talk from this year's Monitorama
EUhttp://vimeo.com/75665562,
which is about using Elasticsearch for logs. The first part is a lot about
performance tweaks for this use-case.

On Wed, Dec 4, 2013 at 11:51 PM, Mark Walkom markw@campaignmonitor.comwrote:

You should set ES MIN and MAX to be the same figure, that figure being 50%
of your total system memory size. This may or may not affect your tests but
it's best practise.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 5 December 2013 04:54, Christopher Jones chrjones17@gmail.com wrote:

Dear Folks:

I am running some scaling tests on elastic search. We are considering
using elastic search to store a large volume of log file data. A key metric
for us is the write throughput. We've been trying to understand how
Elasticsearch can scale. We were hoping to see some form of linear scaling.
That is, add in new nodes, increase the shard count, etc, in order to be
able to handle an increased number of writes. No matter what lever we pull,
we're not seeing an increase in write throughput. Clearly we're missing
something.

We have tested several different clusters on AWS (details below). We
have doubled the number of nodes; we have doubled (and doubled again) the
number of shards. We have doubled the number of servers feeding the cluster
log file data. We have put the nodes in the cluster behind an AWS load
balancer. We have increased the number of open files allowed by ubuntu.
None of these changes has had any effect on the number of records indexed
per second.

Details:

We are piping json data using fluentd, using the fluentd elastic search
plugin, storing the log file entries using the logstash format, which is
supported by the fluentd elastic search plugin. Note that we are not using
logstash as a data pipeline.

We configured the cluster with only 1 replica, the default.

We increased the ulimit for open files to 32000

Elastic search configuration:

discovery.type: ec2

cloud:
aws:

    region: us-west-2
    groups: elastic-search

cloud.node.auto_attributes: true

index.number_of_shards: 40 # we varied this between 10-40
index.refresh_interval: 60s

bootstrap.mlockall: true

discovery.zen.ping.multicast.enabled: false

Here's how we setup the memory configuration in setup.sh

#!/bin/sh
export ES_MIN_MEM=16g
export ES_MAX_MEM=30g
bin/elasticsearch -f

Each node in the cluster is m1.xlarge, which has the following
characteristics:

Instance Family Instance Type Processor Arch vCPU ECU Memory
(GiB)
Instance Storage (GB) EBS-optimized Available Network
Performance
General purpose m1.xlarge 64-bit 4 8 15 4 x 420 Yes High

We've monitored the cluster using http://www.elastichq.org. For each
index, we calculated (Indexing Total)/(Indexing Time) to get the number of
records indexed per second. Whether we have a single node with 10 shards or
6 nodes with 40 shards, we consistently see indexing occurring at a rate of
around 3000-3500 records/second. It is this measure that we never seem to
be able to increase.

Here's some characteristic stats for an individual node:

Heap Used: 1.1gb
Heap Committed: 7.9gb
Non Heap Used: 42.3mb
Non Heap Committed: 66.2mb
JVM Uptime: 4h
Thread Count/Peak: 62 / 82
GC Count: 8156
GC Time: 4.8m
Java Version: 1.7.0_45
JVM Vendor: Oracle Corporation
JVM: Java HotSpot(TM) 64-Bit Server VM

We have not seen memory contention, or high CPU utilization, in general
(CPU utilization is around 80% out of 400% possible). We're not doing any
reading of the database (that would be a subsequent test).

Here's some more stats:
Open File Descriptors: 795 CPU Usage: 68% of 400% CPU System: 1.5m CPU
User: 26.6m CPU Total: 28.2m Resident Memory: 8.4gb Shared Memory: 19.6mbTotal Virtual Memory:10.3gb

Thanks for any help.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/7cbf8e9d-6e14-4154-8f95-aa28e1dc5934%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAEM624YRfRRMHipt-wkZx6JCmYUnNYy_BxuY9fuYtYzVmpUksg%40mail.gmail.com
.

For more options, visit https://groups.google.com/groups/opt_out.

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHXA0_2-P6R-Ls1844vmLbujK6z9AsvNR%3DF0%3D%2BSJopSPk6e5Yg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(pranav amin) #8

Hi Chris,

Did you got better TPS after doing some change. Can you share your
experience?
If so, which parameter did you tweak to get Linear scaling of ES.

We are kind of in the same situation, we aren't able to see better Write
TPS by adding more nodes.

Thanks in advance for your help.
Pranav.

On Wednesday, December 4, 2013 12:54:23 PM UTC-5, Christopher Jones wrote:

Dear Folks:

I am running some scaling tests on elastic search. We are considering
using elastic search to store a large volume of log file data. A key metric
for us is the write throughput. We've been trying to understand how
Elasticsearch can scale. We were hoping to see some form of linear scaling.
That is, add in new nodes, increase the shard count, etc, in order to be
able to handle an increased number of writes. No matter what lever we pull,
we're not seeing an increase in write throughput. Clearly we're missing
something.

We have tested several different clusters on AWS (details below). We have
doubled the number of nodes; we have doubled (and doubled again) the number
of shards. We have doubled the number of servers feeding the cluster log
file data. We have put the nodes in the cluster behind an AWS load
balancer. We have increased the number of open files allowed by ubuntu.
None of these changes has had any effect on the number of records indexed
per second.

Details:

We are piping json data using fluentd, using the fluentd elastic search
plugin, storing the log file entries using the logstash format, which is
supported by the fluentd elastic search plugin. Note that we are not using
logstash as a data pipeline.

We configured the cluster with only 1 replica, the default.

We increased the ulimit for open files to 32000

Elastic search configuration:

discovery.type: ec2

cloud:
aws:

    region: us-west-2
    groups: elastic-search

cloud.node.auto_attributes: true

index.number_of_shards: 40 # we varied this between 10-40
index.refresh_interval: 60s

bootstrap.mlockall: true

discovery.zen.ping.multicast.enabled: false

Here's how we setup the memory configuration in setup.sh

#!/bin/sh
export ES_MIN_MEM=16g
export ES_MAX_MEM=30g
bin/elasticsearch -f

Each node in the cluster is m1.xlarge, which has the following
characteristics:

Instance Family Instance Type Processor Arch vCPU ECU Memory
(GiB)
Instance Storage (GB) EBS-optimized Available Network
Performance
General purpose m1.xlarge 64-bit 4 8 15 4 x 420 Yes High

We've monitored the cluster using http://www.elastichq.org. For each
index, we calculated (Indexing Total)/(Indexing Time) to get the number of
records indexed per second. Whether we have a single node with 10 shards or
6 nodes with 40 shards, we consistently see indexing occurring at a rate of
around 3000-3500 records/second. It is this measure that we never seem to
be able to increase.

Here's some characteristic stats for an individual node:

Heap Used: 1.1gb
Heap Committed: 7.9gb
Non Heap Used: 42.3mb
Non Heap Committed: 66.2mb
JVM Uptime: 4h
Thread Count/Peak: 62 / 82
GC Count: 8156
GC Time: 4.8m
Java Version: 1.7.0_45
JVM Vendor: Oracle Corporation
JVM: Java HotSpot(TM) 64-Bit Server VM

We have not seen memory contention, or high CPU utilization, in general
(CPU utilization is around 80% out of 400% possible). We're not doing any
reading of the database (that would be a subsequent test).

Here's some more stats:
Open File Descriptors:795CPU Usage:68% of 400%CPU System:1.5mCPU User:
26.6mCPU Total:28.2mResident Memory:8.4gbShared Memory:19.6mbTotal
Virtual Memory:10.3gb

Thanks for any help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a17430a3-4493-40a9-9b7c-74f2029e943a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Yehosef) #9

There are good resources for this;
http://edgeofsanity.net/article/2012/12/26/elasticsearch-for-logging.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/capacity-planning.html
https://www.loggly.com/blog/nine-tips-configuring-elasticsearch-for-high-performance/

You should also note that you may have too many shards. Basically for optimal performance you shouldn't have more than 1 shard per elasticsearch instance. Often people choose a little more because once you set the shards for an index you basically cannot change it. So if you expect to grow the cluster sometimes it makes sense to add a few more shards than machines - but I don't think you should have more than 2 per machine.

In our config we are not concerned with losing data so we have the replica set to 0. You CAN change the replication later so after the data is ingested you can increase this if you want more reliability or read throughput.

You should just set the mem to use 50% of the memory and use bootstrap.mlockall = true; You can increase the %mem for bulk indexing. You can tweak the threadpool.bulk.queue_size to make sure you can handle the bulk inputs. You can also tweak the number of records per bulk insert - this is some tweaking based on the doc size, smaller documents you can add more, larger docs, less.

You should tune your index templates and avoid analyzing anything you don't need. It slows things down and consumes more disk without any benefit (if you're not using it - which usually you're not for metrics) The default logstash config doesn't know how you are going to use the data so all the text fields are stored indexed and non-indexed. If you know the structure of your data you should tweak this.

We have a 3 node cluster with 32G each ( also running aerospike so we allocate 12GB for ES) and we easily hit 10k inserts of 500-1000b records. I think the number can be much higher but we haven't done more to tweak the logstash flow.

And as pointed out above, you should make sure you're not swapping - that will kill everything.


(system) #10