10% CPU load without querying

maho · June 9, 2011, 11:26am

Hi,

I have 1 node with 1000 indices. Each index has 100k documents and 15
fields.
When I start elasticsearch I have a very high CPU load of 100-200%
(4 x 2.7 GHz) for 5 minutes.

After startup the CPU load is about 5-10%. Is that OK?

Clinton_Gormley · June 9, 2011, 12:06pm

Hi Maho

That sounds pretty normal, for a high number of indices.

Given that each index is small, I'd set each one to have only one primary
shard (number_of_shards) rather than the default 5.

This will improve startup time, performance and memory usage.
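
As a rough illustration of the suggestion above, the sketch below builds (without sending) a create-index request that sets `number_of_shards` to 1. The host, port and index name `customer_0001` are assumptions made for the example, not values from this thread.

```python
import json
import urllib.request


def build_create_index_request(base_url, index_name, primary_shards=1):
    """Build (but do not send) a PUT request that creates an index
    with an explicit number of primary shards."""
    body = {"settings": {"index": {"number_of_shards": primary_shards}}}
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical node address and index name, for illustration only.
req = build_create_index_request("http://localhost:9200", "customer_0001")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To run it against a live node: urllib.request.urlopen(req)
```

The number of primary shards is fixed when an index is created, so this applies to new indices; existing ones would have to be recreated and reindexed.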

Also, why do you have so many indices? Could these not be combined?

You may be able to make use of the new alias filter functionality that
will be in 0.17:

github.com/elastic/elasticsearch

Aliases: add an ability to specify filters on aliases (issue #971, opened and closed 26 May 2011, by imotov, labels: feature, v0.17.0)

An alias can now have a filter associated with it. Aliases with filters provide an easy way to create different "views" of the same index. The filter can be defined using Query DSL and is applied to all Search, Count, Delete By Query and More Like This operations with this alias.

    $ curl -XPOST 'http://localhost:9200/_aliases' -d '
    {
      "actions" : [
        { "add" : { "index" : "test1", "alias" : "alias2", "filter" : { "term" : { "user" : "kimchy" } } } }
      ]
    }'
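
If many small datasets were combined into one index, the same `_aliases` action could be generated once per dataset. The following is only a sketch: the index name `customers`, the field `customer_id` and the alias naming scheme are hypothetical, and it assumes the field is indexed so that a term filter matches the stored value exactly.

```python
import json


def customer_alias_actions(index, customer_ids, field="customer_id"):
    """Return an _aliases request body with one filtered alias per customer."""
    actions = []
    for cid in customer_ids:
        actions.append({
            "add": {
                "index": index,
                "alias": f"customer_{cid}",          # hypothetical naming scheme
                "filter": {"term": {field: cid}},    # same shape as the issue's example
            }
        })
    return {"actions": actions}


# Hypothetical customer identifiers, for illustration only.
payload = customer_alias_actions("customers", ["acme", "globex"])
print(json.dumps(payload, indent=2))
```

The resulting body would be posted to `/_aliases` in the same way as the curl command in the issue description.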

clint

maho · June 9, 2011, 2:23pm

Thanks for your answer.

I've already set the shards per index to 1.
Currently I'm evaluating solr and elasticsearch.
In Solr, the startup with 1000 indices takes 1 or 2 minutes and after
startup the cpu load is 0%.
So I'm a little bit confused why elasticsearch is so resource
intensive.

Why 1000 indices? Because every customer has its own index. Using an
alias filter could be a solution. But I think that would affect the
scoring of the documents, because the ranking would be calculated on
the basis of all documents (=> inverse document frequency)?!
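
The scoring concern can be made concrete with a small calculation. This is a sketch that assumes Lucene's classic TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)), and made-up document counts; it also assumes that a filter only restricts which documents match and leaves the index-wide term statistics unchanged.

```python
import math


def classic_idf(num_docs, doc_freq):
    """Inverse document frequency as in Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical numbers: a term that appears in half of one customer's documents.
docs_per_customer = 100_000
customers = 1000
doc_freq = 50_000

# Separate index per customer: statistics come from that customer's documents only.
idf_own_index = classic_idf(docs_per_customer, doc_freq)

# One shared index: the same term is now measured against every customer's documents
# (assuming, for simplicity, that no other customer uses the term).
idf_shared_index = classic_idf(docs_per_customer * customers, doc_freq)

print(f"idf in own index:    {idf_own_index:.3f}")
print(f"idf in shared index: {idf_shared_index:.3f}")
```

Under these assumptions a term that is common for one customer looks rare in the shared index, so it would be weighted more heavily than in a dedicated index.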

kimchy · June 9, 2011, 11:46pm

After things have settled (cluster health is green), you should not get high CPU load. I just did a quick test on my machine, and I get 0.2% usage with 1000 indices.

The reason recovery takes time and is resource intensive is mainly that, for each shard, the transaction log is replayed (elasticsearch does not require a commit to be issued for data to be "safe"). You can issue a flush to flush the transaction log before you shut down, and then there won't be anything to replay.
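
A minimal sketch of the two steps mentioned above: checking whether cluster health is green, and issuing a flush before shutdown. It assumes a node at `http://localhost:9200`; the health response shown is a hand-written sample, not output captured from a real cluster.

```python
import json
import urllib.request


def build_flush_request(base_url):
    """Build (but do not send) a POST to /_flush, which flushes all indices."""
    return urllib.request.Request(url=f"{base_url}/_flush", method="POST")


def is_green(health_body):
    """Return True if a /_cluster/health response body reports status green."""
    return json.loads(health_body).get("status") == "green"


# Hand-written sample of a health response while shards are still recovering.
sample_health = '{"cluster_name": "elasticsearch", "status": "yellow", "unassigned_shards": 12}'
print("recovered:", is_green(sample_health))

flush_req = build_flush_request("http://localhost:9200")  # assumed address
print(flush_req.get_method(), flush_req.full_url)
# Before stopping the node: urllib.request.urlopen(flush_req)
```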

maho · June 10, 2011, 12:04pm

Now I deleted every index and created 1000 empty indices:

1000 open indices: 8% CPU load (8 CPU cores, so it will be about
1.00%)
1000 closed indices: 0-1% CPU load (8 CPU cores, so it will be about
0.06%)
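
The per-core conversion in parentheses appears to work as follows. This sketch assumes the reported figure is relative to a single core (as in top-style output) and takes 0.5% as the midpoint of the 0-1% range.

```python
def share_of_total_capacity(reported_percent, cores):
    """Convert a load figure measured against one core into a share of the whole machine."""
    return reported_percent / cores


print(share_of_total_capacity(8.0, 8))  # open indices   -> 1.0
print(share_of_total_capacity(0.5, 8))  # closed indices -> 0.0625
```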

The bad thing is that the queries per second (JMeter test) decreased
by 20%.

Is that ok or is something wrong with my installation /
configuration / system?

maho · June 11, 2011, 5:21pm

I have now installed ES on my Windows machine and the problems didn't
appear.
I also didn't get the DEBUG messages that I see on Linux (same config on
both systems):

...
[2011-06-11 19:15:06,007[DEBUG][gateway.local] [Buzz] [core0][0]: throttling allocation [[core0][0], node[null]], [P], s[UNASSIGNED]] to [[Buzz][bfMpEcLeS-qBG8tT_QN_sw][inet[/10.0.2.15:9300]]] on primary allocation

[2011-06-11 19:15:06,784[DEBUG][gateway.local] [Buzz] [core1][0]: throttling allocation [[core1][0], node[null]], [P], s[UNASSIGNED]] to [[Buzz][bfMpEcLeS-qBG8tT_QN_sw][inet[/10.0.2.15:9300]]] on primary allocation
...

kimchy · June 12, 2011, 7:13am

There isn't a difference between Windows and Linux in this case. The messages you see are elasticsearch throttling the concurrent allocation of shards on the same node so that it won't be overloaded (and become unusable). By default, it allows 4 concurrent allocations per node.
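
The general idea of capping concurrent work per node can be sketched with a semaphore. This is a toy simulation and not how elasticsearch implements allocation throttling; the only value taken from the thread is the limit of 4, and the sleep is a stand-in for recovery work.

```python
import threading
import time

MAX_CONCURRENT = 4  # the default mentioned in the post above


def simulate_allocations(shard_count, max_concurrent=MAX_CONCURRENT):
    """Start one worker per shard, let at most max_concurrent run at once,
    and return the highest number that ran simultaneously."""
    slots = threading.Semaphore(max_concurrent)
    lock = threading.Lock()
    running = 0
    peak = 0

    def recover(shard_id):
        nonlocal running, peak
        with slots:                      # wait here if all slots are taken
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(0.01)             # stand-in for replaying a transaction log
            with lock:
                running -= 1

    threads = [threading.Thread(target=recover, args=(i,)) for i in range(shard_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


peak = simulate_allocations(20)
print("peak concurrent recoveries:", peak)
```

With 1000 single-shard indices and only a few slots, most shards wait their turn, which is consistent with seeing many "throttling allocation" messages during startup.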
