High CPU usage on large EC2 nodes

Hi,

I'm pretty new to Elasticsearch, though I have used Lucene extensively.
We are currently migrating from Lucene to Elasticsearch in our project.

We have created a basic Elasticsearch setup on AWS and are trying to test
its performance.

The configuration:
EC2 Nodes - 2 Large nodes
Shards - 5
Replication - 1
Memory settings - 4GB

We have created a basic index of about 7GB. For the performance tests, we
have kept the index essentially constant, i.e., the index is not being
updated. No indexing requests are sent to the Elasticsearch server.

Now we are bombarding *each* Elasticsearch node with about 100 search
requests per second (using a single JMeter client for this). Each search
query is a boolean query with 5-6 term query clauses.
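For concreteness, a query of that shape might look like this in the Elasticsearch query DSL (the field names and values here are invented for illustration, not taken from the actual test):

```python
import json

# A boolean query with five term clauses, roughly the shape described
# above; every field/value below is made up.
query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"status": "active"}},
                {"term": {"category": "books"}},
                {"term": {"lang": "en"}},
                {"term": {"region": "us"}},
                {"term": {"format": "paperback"}},
            ]
        }
    }
}

# This JSON string would be POSTed as the body of a _search request.
body = json.dumps(query)
```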

Under this load, CPU utilization goes up to 75%. The performance of each
query is still good: each query took about *90ms* to return its result.

We then reduced the shards to 3 and ran the same tests.
The CPU usage remained the same but the performance degraded. Now each
request took about 180ms to return the result.

We expected the results to improve since we reduced the number of shards,
but the opposite happened. Is this the expected result?
And is the high CPU usage also expected?

Thanks
Rohit

--

In general, yes, decreasing the number of shards should improve search
performance (fewer Lucene indices to search against), but I suspect that in
your benchmarking scenario there are many variables and it's hard to keep
them consistent:

  • The m1.large instance type is quite small; in a sense it has a lot of
    "neighbours" -- you never know who is doing what in the same rack
  • The m2.xlarge is better in this sense, and also allows you to use the
    high-I/O EBS volumes
  • A lot depends on the disk used for ES -- are you using the EBS-backed
    instance disk? The "physical" ephemeral disk of the instance? An extra
    EBS volume, possibly with provisioned IOPS?
  • Regarding the CPU, I'd say it's expected that you'll saturate the
    machine's resources at some point, and ~100 req/sec sounds kinda OK to me
    for the type of machine in question. You can use the hot_threads API to
    check where the time is spent
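The hot_threads check above is a plain HTTP GET; a minimal sketch, assuming a node listening on the default port 9200 (the endpoint path matches the docs of that era, so verify it against your ES version):

```python
import urllib.request


def hot_threads_url(host="localhost", port=9200, threads=3):
    # `threads` caps how many hot threads are reported per node
    return f"http://{host}:{port}/_nodes/hot_threads?threads={threads}"


# Against a live node you would fetch and print the plain-text report:
#   print(urllib.request.urlopen(hot_threads_url()).read().decode())
url = hot_threads_url()
```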

Karel

On Monday, January 28, 2013 6:52:19 PM UTC+1, rohit reddy wrote:

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

We are using ephemeral disks with S3 backups, since we expect the
performance of ephemeral disks to be better than EBS. And since our index
does not get updated very frequently, the overhead of storing backups in S3
is not huge.

I'll use the hot_threads API and try to identify which resource is taking
up the CPU.

Thanks
Rohit

--

Attached is the hot_threads snapshot from the Elasticsearch API.
I'm using DFS_QUERY_THEN_FETCH for the search.

Thread stack - elasticsearch · GitHub

It seems like most of the threads are waiting on reads from the Lucene
index. Is this normal, or should I tweak some configuration to reduce it?
We're using all defaults for now.
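The search_type in question is passed as a query-string parameter on the search endpoint; a small sketch of the two URLs being compared (host and index name are invented):

```python
def search_url(index, search_type="query_then_fetch",
               host="localhost", port=9200):
    # search_type selects the distributed execution mode, e.g. the
    # default query_then_fetch vs. dfs_query_then_fetch
    return f"http://{host}:{port}/{index}/_search?search_type={search_type}"


dfs = search_url("myindex", "dfs_query_then_fetch")
default = search_url("myindex")
```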

--

It seems like it's waiting most of the time on reads. Which AWS instance type are you using? Make sure to have ~50% of the memory allocated to ES (ES_HEAP_SIZE), and the other half left to the OS.
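The ~50% rule above, expressed as a tiny helper (the function is purely illustrative, not an ES API; the m1.large RAM figure is AWS's published spec):

```python
def heap_size_mb(total_ram_mb: int) -> int:
    # Give about half the machine's RAM to the ES heap and leave the
    # rest to the OS filesystem cache.
    return total_ram_mb // 2


# An m1.large has ~7.5 GB of RAM, so roughly ES_HEAP_SIZE=3840m
m1_large_heap = heap_size_mb(7680)
```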

Also, which Java version are you using? Make sure you are on the latest 1.6 (update 34 and above) or 1.7. This makes a big difference, and on older Linux distros the default Java provided is pretty old (4 years old).

I would shy away from the DFS_ search types; typically you don't really need them with a big enough data set.

Last, the reason why more shards performed better is that, even on 2 nodes, each search request was being parallelized across more shards (each holding less data). Note: if you start running concurrent client tests, make sure to configure the search thread pool with a fixed size of about 4 times the number of CPUs you have, so it won't be overwhelmed by concurrent executions.
