High CPU usage (90-100%) on Elasticsearch servers

Han_2 · June 5, 2013, 12:47am

We are going live with Elasticsearch, and as soon as we apply production
load, CPU spikes to 90-100%. I'm wondering what's causing the issue.

Here is the cluster info:
ES version: 0.90.0
5 nodes, each with the following config:
64 GB RAM
ES_MIN_MEM = ES_MAX_MEM, both set to 32 GB
24 cores

3 indices:

5 shards, 4 replicas
5 shards, 4 replicas
2 shards, 4 replicas

Total documents: 5 million; total size of the indices: 45 GB

Attaching the result of the hot threads command.

I would appreciate your help on this.

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

mvg · June 5, 2013, 2:53pm

A few questions:

What is your search load? How many requests per second? From the
hot threads output I see most of the time is spent on searching.
What kind of searches are you executing? If possible, can you perhaps
share examples of your queries via a gist?
How much of the heap space are you actually using? You can see this via
the node stats API (with the jvm flag): http://localhost:9200/_nodes/stats?jvm

Martijn
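
As a rough illustration of the heap check suggested above, here is a minimal Python sketch that reads heap usage out of a node stats response. It assumes each node entry exposes `jvm.mem.heap_used_in_bytes` and `jvm.mem.heap_committed_in_bytes`; verify those field names against the output of your own version. The sample response and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

GIB = 1024 ** 3


def heap_usage(stats):
    """Map node name -> (heap used in GiB, heap committed in GiB).

    Assumes every node entry carries jvm.mem.heap_used_in_bytes and
    jvm.mem.heap_committed_in_bytes; check this against real output.
    """
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        used = round(mem["heap_used_in_bytes"] / GIB, 2)
        committed = round(mem["heap_committed_in_bytes"] / GIB, 2)
        usage[node.get("name", node_id)] = (used, committed)
    return usage


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch node stats with the jvm flag (same URL as in the post above)."""
    with urlopen(base_url + "/_nodes/stats?jvm") as resp:
        return json.load(resp)


# Hypothetical sample response, trimmed to the fields used here.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "node-a",
            "jvm": {"mem": {"heap_used_in_bytes": 4 * GIB,
                            "heap_committed_in_bytes": 32 * GIB}},
        }
    }
}
print(heap_usage(sample))
```

Against a live cluster, `heap_usage(fetch_node_stats())` would give the same summary per node. A large gap between used and committed heap is the signal to look for.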

Han_2 · June 5, 2013, 6:24pm

Thanks Martijn for your response. Please see my response inline.

On Wednesday, June 5, 2013 7:53:51 AM UTC-7, Martijn v Groningen wrote:

A few questions:

What is your search load? How many requests per second? From the
hot threads output I see most of the time is spent on searching.

Search load is 250 rps on a cluster of 5 beefy servers, each with 24 cores
and 64 GB of memory. Search response times under this load are less than a
second, which is acceptable to us. It's just that CPU shoots up to 100%
under this load and the cluster dies after some time.
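
A back-of-the-envelope sketch of what 250 rps means at the shard level. A search against an index runs on one copy of every shard of that index, so the shard-level request rate is a multiple of the client-facing rate. The thread does not say which index each query targets, so the 5-shard figure and the even spread across nodes are assumptions.

```python
def shard_searches_per_second(requests_per_second, shards_per_index):
    """Each search fans out to one copy of every shard of the index."""
    return requests_per_second * shards_per_index


def per_node(shard_rate, nodes):
    """Shard-level searches per node, assuming an even spread."""
    return shard_rate / nodes


rate = shard_searches_per_second(250, 5)
print(rate)               # 1250
print(per_node(rate, 5))  # 250.0
```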

What kind of searches are you executing? If possible, can you perhaps
share examples of your queries via a gist?

Here is a gist of our query; most of our queries look like this, with
minor variations.

https://gist.github.com/anonymous/5715994

query

 {
  "from": 0,
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "custom_filters_score": {
          "query": {
            "bool": {
              "minimum_number_should_match": 1,

(Gist preview truncated; the full query is at the link above.)

How much of the heap space are you actually using? You can see this via
the node stats api (with jvm flag):
http://localhost:9200/_nodes/stats?jvm

I will gist this soon.

mvg · June 5, 2013, 6:40pm

I see that you use the top-level filter. Unless you are also using facets
(which is not the case here), I would recommend putting all filters in the
filtered query. Also, if you upgrade to version 0.90.1, I would use the
bool filter instead of the and, or, and not filters in your case. This will
most likely execute your query more efficiently. In 0.90.0 there
is a bug in the bool filter.
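
To make the suggested rewrite concrete, here is a minimal Python sketch that folds a top-level `filter` into a `filtered` query and rewrites `and`/`or`/`not` filters as `bool` filters. It is an illustration under assumptions, not the exact query from the gist: the field names are hypothetical, the helper names are made up, and caching options such as `_cache` on the original filters are dropped.

```python
import copy
import json


def _clauses(value):
    """and/or accept either a list of filters or {"filters": [...]}."""
    if isinstance(value, dict):
        return value.get("filters", [])
    return value


def to_bool_filter(f):
    """Recursively rewrite and/or/not filters as bool filters."""
    if not isinstance(f, dict):
        return f
    if "and" in f:
        return {"bool": {"must": [to_bool_filter(c) for c in _clauses(f["and"])]}}
    if "or" in f:
        return {"bool": {"should": [to_bool_filter(c) for c in _clauses(f["or"])]}}
    if "not" in f:
        inner = f["not"]
        inner = inner.get("filter", inner)  # also accept {"not": {"filter": {...}}}
        return {"bool": {"must_not": [to_bool_filter(inner)]}}
    return f


def fold_top_level_filter(body):
    """Return a copy of a search body with its top-level filter moved
    into a filtered query (the input is left untouched)."""
    body = copy.deepcopy(body)
    top = body.pop("filter", None)
    if top is None:
        return body
    top = to_bool_filter(top)
    query = body.get("query", {"match_all": {}})
    if "filtered" in query:
        inner = query["filtered"]
        if "filter" in inner:
            # Combine with the filter that is already inside the filtered query.
            top = {"bool": {"must": [to_bool_filter(inner["filter"]), top]}}
        inner["filter"] = top
    else:
        query = {"filtered": {"query": query, "filter": top}}
    body["query"] = query
    return body


# Hypothetical request body; field names are for illustration only.
body = {
    "size": 20,
    "query": {"match": {"title": "laptop"}},
    "filter": {"and": [{"term": {"status": "active"}},
                       {"not": {"term": {"hidden": True}}}]},
}
print(json.dumps(fold_top_level_filter(body), indent=2))
```

The top-level filter narrows the hits without affecting facets, which is why moving it is only safe when facets are not in play.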

Are you highlighting on large fields? If so I would maybe enable term
vectors ("term_vector" : "with_positions_offsets") for these fields. This
will make your index larger, but highlighting will be much faster.
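
A small sketch of what enabling term vectors looks like in a mapping, written as a Python helper that edits a mapping dict. The type name `article` and the field `body` are hypothetical. Term vectors are written at index time, so documents indexed before the mapping change would need to be reindexed to benefit.

```python
import copy


def with_term_vectors(mapping, doc_type, fields):
    """Return a copy of `mapping` with term vectors (positions and
    offsets) enabled on the given fields of `doc_type`."""
    mapping = copy.deepcopy(mapping)
    props = mapping[doc_type]["properties"]
    for name in fields:
        props[name]["term_vector"] = "with_positions_offsets"
    return mapping


# Hypothetical mapping; type and field names are for illustration only.
mapping = {
    "article": {
        "properties": {
            "title": {"type": "string"},
            "body": {"type": "string"},
        }
    }
}
updated = with_term_vectors(mapping, "article", ["body"])
print(updated["article"]["properties"]["body"])
```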

Han_2 · June 5, 2013, 6:46pm

Thanks Martijn. I will look at the "bool" filter and see if we can upgrade
to 0.90.1; I will keep you posted.

Here is a gist of the heap usage:

https://gist.github.com/anonymous/5716142

es node stats jvm

{
  "cluster_name" : "cluster_test",
  "nodes" : {
    "5B6RZqjNQle8QReug-O6qQ" : {
      "timestamp" : 1370455698718,
      "name" : "SOC14",
      "transport_address" : "",
      "hostname" : "host1",
      "indices" : {
        "docs" : {

(Gist preview truncated; the full output is at the link above.)

Let me know if you notice anything weird.

Han_2 · June 5, 2013, 7:19pm

Here is the gist of the heap usage:

https://gist.github.com/anonymous/5716142

Also, we have already enabled term vectors on the fields that we
highlight.

mvg · June 5, 2013, 7:59pm

The actual heap usage (at most 3.9GB) is way lower than the allocated heap.
I assume you're not using faceting, scripts, or sorting by a field, right? If
that is the case, I'd lower ES_HEAP_SIZE to something like 5GB. This
way you give the filesystem cache more space. Lucene (the underlying search
library that ES uses) depends a lot on the filesystem cache to execute
queries. The more space is available in the filesystem cache, the more Lucene
index files end up in it, and this will result in faster queries.
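
A rough arithmetic sketch of the trade-off described above, using the numbers from this thread. It ignores JVM overhead outside the heap and any other processes on the machine. With 5 nodes and 4 replicas, every node holds one copy of each shard; whether the reported 45 GB includes replicas is not stated in the thread, so both readings are shown.

```python
def cache_headroom_gb(total_ram_gb, heap_gb):
    """RAM left for the OS and its filesystem cache after the JVM heap."""
    return total_ram_gb - heap_gb


def per_node_index_gb(total_gb, copies, includes_replicas):
    """Index data per node when every node holds one full copy."""
    return total_gb / copies if includes_replicas else total_gb


for heap in (32, 5):
    print(heap, cache_headroom_gb(64, heap))

print(per_node_index_gb(45, 5, includes_replicas=True))
print(per_node_index_gb(45, 5, includes_replicas=False))
```

If 45 GB is the size of a single copy, a 32 GB heap leaves less cache than index data on each node, while a 5 GB heap leaves room for all of it.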

Han_2 · June 5, 2013, 8:14pm

Will try that. But do you think it's due to not having enough memory for the
Lucene filesystem cache? We have a total of 64 GB of memory on each node, and
we have allocated half of it (32 GB) to ES_HEAP_SIZE.

We do not have faceting or scripts, but we do have "sorting" in our queries.

mvg · June 5, 2013, 8:25pm

I'm not 100% sure, but in general it is a waste to allocate a big heap
and not use it, while that memory can be used by the filesystem cache if it
is not allocated to ES. I also expect garbage collections to be faster with
smaller JVM heaps. By the way, what Java version are you using?

Are you sorting by score or a field?

Han_2 · June 5, 2013, 8:40pm

We are using Java version 7.0.210.11.

Regarding sorting, most of the time we use the default sorting
provided by ES (which is sort by score). On very few queries we
sort on a couple of numeric fields.

mattweber · June 5, 2013, 8:54pm

Is that OpenJDK? If yes, you should give the latest official Oracle JDK 7
a try. There have been quite a few issues like this popping up and the
common theme seems to be OpenJDK.

Han_2 · June 5, 2013, 8:57pm

Nope, it's the official Oracle JDK 7.

mattweber · June 5, 2013, 9:06pm

OK, that version number looked weird to me. What is the output of the JVM
nodes info API?

curl -XGET 'http://localhost:9200/_nodes/jvm?pretty=true'
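
For comparing nodes at a glance, a minimal Python sketch that summarizes the JVM section of a nodes info response like the one the curl command returns. It assumes each node entry has a `jvm` object with `version`, `vm_name`, and `vm_vendor` fields; check this against your actual output. The sample values and helper names are hypothetical.

```python
def jvm_summary(nodes_info):
    """Map node name -> JVM version, VM name, and vendor."""
    summary = {}
    for node_id, node in nodes_info.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        summary[node.get("name", node_id)] = {
            "version": jvm.get("version"),
            "vm_name": jvm.get("vm_name"),
            "vm_vendor": jvm.get("vm_vendor"),
        }
    return summary


def mixed_versions(summary):
    """True if the nodes do not all report the same JVM version."""
    return len({info["version"] for info in summary.values()}) > 1


# Hypothetical sample response, trimmed to the fields used here.
sample_info = {
    "nodes": {
        "node-id-1": {
            "name": "node-a",
            "jvm": {"version": "1.7.0_21",
                    "vm_name": "Java HotSpot(TM) 64-Bit Server VM",
                    "vm_vendor": "Oracle Corporation"},
        }
    }
}
summary = jvm_summary(sample_info)
print(summary)
print(mixed_versions(summary))
```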

On Wed, Jun 5, 2013 at 1:57 PM, Han hradusumalli@gmail.com wrote:

Nope. its official Oracle JDK 7.

On Wednesday, June 5, 2013 1:54:24 PM UTC-7, Matt Weber wrote:

Is that OpenJDK? If yes, you should give the latest official Oracle JDK
7 a try. There have been quite a few issues like this popping up and the
common theme seems to be OpenJDK.

On Wed, Jun 5, 2013 at 1:40 PM, Han hradus...@gmail.com wrote:

We are using version 7.0.210.11

Regarding sorting, most of the time we are using the default sorting
provided by ES (which is sort by the score), on very few queries we do have
sorting based on a couple of numeric fields.

On Wednesday, June 5, 2013 1:25:58 PM UTC-7, Martijn v Groningen wrote:

I'm not 100% sure, but in general it is a waste to allocate a big heap
space and not use it. While the actual memory can be used if it is not
allocated to ES. I also expect garbage collections to be faster with
smaller jvm's. Btw what Java version are you using?

Are you sorting by score or a field?

On 5 June 2013 22:14, Han hradus...@gmail.com wrote:

Will try that. but do you think its due to having not enough memory
for Lucene file system cache? we have a total of 64gb memory on each and we
have allocated half of it (32gb) to ES_HEAP_SPACE.

We do not have faceting or script but we do have "sorting" in our
queries.

On Wednesday, June 5, 2013 12:59:24 PM UTC-7, Martijn v Groningen
wrote:

The actual heap usage (at most 3.9GB) is way lower than the allocated
heap. I assume you're not using faceting, scripts or sorting by a field,
right? If that is the case I'd lower ES_HEAP_SIZE to something like
5GB. This way you give the filesystem cache more space. Lucene (the
underlying search library that ES uses) depends a lot on the filesystem
cache to execute queries. The more space is available in the filesystem
cache, the more Lucene index files end up in it, and this will result in
faster queries.
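
To make the trade-off concrete, here is a rough sketch of the arithmetic. It assumes, as a simplification, that any RAM not claimed by the JVM heap is available to the OS filesystem cache. The function names are made up for illustration; the 64 GB, 32 GB, 5 GB and 3.9 GB figures are the ones quoted in this thread.

```python
def page_cache_headroom_gb(total_ram_gb, heap_gb):
    """Upper bound on RAM left for the OS filesystem cache, in GB.

    Assumption: everything outside the JVM heap is available to the
    cache. In practice the OS, JVM non-heap memory and other processes
    take a share, so treat this as an optimistic estimate.
    """
    if heap_gb > total_ram_gb:
        raise ValueError("heap cannot exceed physical RAM")
    return total_ram_gb - heap_gb


def heap_utilization(used_gb, heap_gb):
    """Fraction of the allocated heap that is actually in use."""
    return used_gb / heap_gb


if __name__ == "__main__":
    # Figures from the thread: 64 GB RAM per node, 32 GB heap, ~3.9 GB used.
    print(page_cache_headroom_gb(64, 32))  # headroom with the current heap
    print(page_cache_headroom_gb(64, 5))   # headroom with the suggested heap
    print(round(heap_utilization(3.9, 32) * 100, 1), "% of heap in use")
```

With the suggested setting, most of each node's RAM would be left to the filesystem cache, which is the point being made above. Whether that actually lowers CPU usage here is something only a test under load can show.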

On 5 June 2013 21:19, Han hradus...@gmail.com wrote:

Here is the gist on heap usage:
https://gist.github.com/anonymous/5716142

Also, we have already enabled term vectors on the fields that we
highlight.

On Wednesday, June 5, 2013 11:46:05 AM UTC-7, Han wrote:

Thanks Martijn. I will look at the "bool" filter and see if we can
upgrade to 0.90.1. I will keep you posted.

Here is the gist on heap usage:
https://gist.github.com/anonymous/5716142

Let me know if you notice anything weird.

On Wednesday, June 5, 2013 11:40:23 AM UTC-7, Martijn v Groningen
wrote:

What kind of searches are you executing? If possible, can you perhaps
share examples of your queries via a gist?

Here is the gist of our query. Most of our queries are like this, with
small changes.
https://gist.github.com/anonymous/5715994

I see that you use the top level filter. Unless you are also using
facets (which is not the case here), I would recommend putting all filters
in the filtered query. Also, if you upgrade to version 0.90.1 I would use
the "bool" filter over the "and", "or" and "not" filters in your case. This
will most likely execute your query in a more efficient manner. In 0.90.0
there is a bug in the bool filter.
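
As an illustration of that suggestion, the following sketch rewrites a search body so that a top-level "filter" becomes the filter of a "filtered" query. The query in the gist is not reproduced here, so the example request and its field names ("title", "status", "deleted") are hypothetical, and the helper name is made up. It does not cover filters that are used for boosting.

```python
import json


def to_filtered_query(request):
    """Move a top-level "filter" into a "filtered" query.

    Assumes `request` is a search body of the form
    {"query": ..., "filter": ...}. Other keys (size, sort, highlight,
    ...) are copied unchanged. Returns a new dict; the input is not
    modified.
    """
    if "filter" not in request:
        return dict(request)
    rewritten = {k: v for k, v in request.items() if k not in ("query", "filter")}
    rewritten["query"] = {
        "filtered": {
            # Fall back to match_all when the request had no query part.
            "query": request.get("query", {"match_all": {}}),
            "filter": request["filter"],
        }
    }
    return rewritten


if __name__ == "__main__":
    # Hypothetical request; the field names are invented for this example.
    original = {
        "query": {"match": {"title": "elasticsearch"}},
        "filter": {
            "bool": {
                "must": [{"term": {"status": "published"}}],
                "must_not": [{"term": {"deleted": True}}],
            }
        },
        "size": 10,
    }
    print(json.dumps(to_filtered_query(original), indent=2))
```

The example already uses a "bool" filter with "must" and "must_not" clauses in place of "and" and "not" filters, which is the form recommended for 0.90.1.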

Are you highlighting on large fields? If so I would maybe enable
term vectors ("term_vector" : "with_positions_offsets") for these fields.
This will make your index larger, but highlighting will be much faster.
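
A minimal sketch of what such a mapping could look like, built as a Python dict so it can be serialized and sent to the put mapping API. The type name "article" and the field name "body" are placeholders, not taken from the thread. As far as I know, changing this setting on an existing field requires reindexing, so treat this as a starting point.

```python
import json


def term_vector_mapping(type_name, field_names):
    """Build a mapping body that enables term vectors on string fields.

    `type_name` and `field_names` are placeholders supplied by the caller.
    """
    properties = {
        name: {"type": "string", "term_vector": "with_positions_offsets"}
        for name in field_names
    }
    return {type_name: {"properties": properties}}


if __name__ == "__main__":
    # Hypothetical type and field names.
    print(json.dumps(term_vector_mapping("article", ["body"]), indent=2))
```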


Han_2 · June 5, 2013, 9:19pm

Output of the JVM nodes info:

gist: https://gist.github.com/anonymous/5717413

"jvm" : {
  "pid" : 8064,
  "version" : "1.7.0_21",
  "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
  "vm_version" : "23.21-b01",
  "vm_vendor" : "Oracle Corporation",
  "start_time" : 1370391059981,
  "mem" : {
    "heap_init" : "32gb",
    "heap_init_in_bytes" : 34359738368,
    "heap_max" : "31.8gb",
    "heap_max_in_bytes" : 34202714112,
    "non_heap_init" : "23.1mb",
    "non_heap_init_in_bytes" : 24313856,
    "non_heap_max" : "130mb",
    "non_heap_max_in_bytes" : 136314880,
    "direct_max" : "31.8gb",
    "direct_max_in_bytes" : 34202714112
  }
}
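
For a cluster with several nodes it can be easier to condense this output than to read it node by node. The sketch below summarizes an already parsed nodes info response. It assumes the response has the layout {"nodes": {<node_id>: {"name": ..., "jvm": {...}}}}; the node id "abc123" and the name "node-1" in the sample are invented, while the JVM values are the ones posted above.

```python
def summarize_jvm(nodes_info):
    """Return {node_name: (java_version, vendor, heap_max_gb)}.

    Assumes the parsed response has a top-level "nodes" object keyed by
    node id, each entry carrying a "jvm" section like the one above.
    """
    summary = {}
    for node_id, node in nodes_info.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        heap_max = jvm.get("mem", {}).get("heap_max_in_bytes", 0)
        summary[node.get("name", node_id)] = (
            jvm.get("version"),
            jvm.get("vm_vendor"),
            round(heap_max / 2**30, 2),  # bytes to GiB
        )
    return summary


if __name__ == "__main__":
    # Sample built from the output above; node id and name are made up.
    sample = {
        "nodes": {
            "abc123": {
                "name": "node-1",
                "jvm": {
                    "version": "1.7.0_21",
                    "vm_vendor": "Oracle Corporation",
                    "mem": {"heap_max_in_bytes": 34202714112},
                },
            }
        }
    }
    for name, info in summarize_jvm(sample).items():
        print(name, info)
```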


Han_2 · June 6, 2013, 5:20pm

Can anyone help us with the CPU spike issue?


Han_2 · June 6, 2013, 5:33pm

One more observation: when we reduced the load from 250 rps to 200 rps, CPU
went down to 75% on average.
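
A back-of-the-envelope way to read these numbers is to estimate how much CPU time each request costs across the cluster. The sketch below assumes that load is spread evenly over the 5 nodes with 24 cores each (from the first post) and that essentially all CPU time goes to search. Both are simplifications, and the function name is made up.

```python
def cpu_seconds_per_request(nodes, cores_per_node, utilization, requests_per_second):
    """Estimate the cluster-wide CPU seconds consumed per search request.

    utilization is the average CPU usage as a fraction (0.75 for 75%).
    Assumes even load across nodes and that search dominates CPU usage.
    """
    busy_cores = nodes * cores_per_node * utilization
    return busy_cores / requests_per_second


if __name__ == "__main__":
    # 5 nodes x 24 cores at 75% average CPU while serving 200 rps.
    print(cpu_seconds_per_request(5, 24, 0.75, 200))
```

Under these assumptions each request costs roughly 0.45 core-seconds summed over all shards it touches. The drop from 250 to 200 rps (a factor of 0.8) also lines up with CPU falling from the 90-100% range to about 75%, which suggests that CPU usage scales roughly linearly with request rate rather than being caused by a background task.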


mvg · June 6, 2013, 7:31pm

Was this with or without changing the filters in your search requests and
lowering the ES_HEAP_SIZE?

You also have 4 replica shards, so in total 5 copies of your data.
Each shard (primary and replica) takes up resources. A single shard can
take all your system resources if it needs to. For this reason, just
having 4 replica shards doesn't improve read performance on its own;
you also need the machines for it. High availability can already be
achieved with 1 replica per shard. So maybe just set
index.number_of_replicas to 1?
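
To put numbers on this, the sketch below counts shard copies for the three indices described in the first post (5, 5 and 2 primary shards) and builds the body for an update index settings request. The function names are made up for illustration; the body is meant to be sent with PUT to an index's _settings endpoint.

```python
import json


def total_shard_copies(primaries_per_index, replicas):
    """Total number of shard copies (primaries plus replicas) in the cluster."""
    return sum(p * (1 + replicas) for p in primaries_per_index)


def replica_settings_body(replicas):
    """Request body for changing the replica count of an index."""
    return {"index": {"number_of_replicas": replicas}}


if __name__ == "__main__":
    primaries = [5, 5, 2]  # the three indices from the first post
    nodes = 5
    for replicas in (4, 1):
        copies = total_shard_copies(primaries, replicas)
        print(replicas, "replicas:", copies, "shard copies,",
              copies / nodes, "per node on average")
    print(json.dumps(replica_settings_body(1)))
```

With 4 replicas every node holds a copy of every shard, so any node can answer any request locally. With 1 replica the copies are spread out, which lowers per-node resource usage but means the cluster tolerates fewer simultaneous node failures.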

--
Kind regards,

Martijn van Groningen


Han_2 · June 6, 2013, 8:33pm

Thanks for the response Martijn, please see my response inline.

On Thursday, June 6, 2013 12:31:16 PM UTC-7, Martijn v Groningen wrote:

Was this with or without changing the filters in your search requests and
lowering the ES_HEAP_SIZE?

CPU was at 75% on average when we reduced the rps from 250 to 200,
regardless of ES_HEAP_SIZE. Also, we did not change the filters in the
high-level query since we need to boost the results based on filter
criteria. As you can see, we have two sets of filters (1. for boosting the
score, 2. for filtering out documents).

You also have 4 replica shards, so in total 5 copies of your data.
Each shard (primary and replica) takes up resources. A single shard can
take all your system resources if it needs to. For this reason, just
having 4 replica shards doesn't improve read performance on its own;
you also need the machines for it. High availability can already be
achieved with 1 replica per shard. So maybe just set
index.number_of_replicas to 1?

We can give it a try, but with 4 replica shards it became easier for us to
take down a node and add it back, since the data is replicated on every
server. Also, with 1 replica we noticed that the load is not evenly
distributed across all servers, and since we have really beefy machines we
thought it would not hurt to keep 4 replicas. Is there any guideline on how
many replicas are needed?

On 6 June 2013 19:33, Han <hradus...@gmail.com <javascript:>> wrote:

one more observation: when we changed the load from 250 rps to 200 rps
cpu went down to 75% on avg.

On Thursday, June 6, 2013 10:20:48 AM UTC-7, Han wrote:
Can any one help us with cpu spike issue?

On Wednesday, June 5, 2013 2:19:55 PM UTC-7, Han wrote:
output jvm nodes info

gist: https://gist.github.com/**anonymous/5717413 https://gist.github.com/anonymous/5717413

"jvm" : {
    "pid" : 8064,

    "version" :
"1.7.0_21",
    "vm_name" :
"Java HotSpot(TM) 64-Bit Server VM",
    "vm_version"
: "23.21-b01",
    "vm_vendor" :
"Oracle Corporation",
    "start_time"
: 1370391059981,
    "mem" : {
"heap_init" : "32gb",

"heap_init_in_bytes" : 34359738368,
      "heap_max"
: "31.8gb",

"heap_max_in_bytes" : 34202714112,

"non_heap_init" : "23.1mb",

"non_heap_init_in_bytes" : 24313856,

"non_heap_max" : "130mb",

"non_heap_max_in_bytes" : 136314880,

"direct_max" : "31.8gb",

"direct_max_in_bytes" : 34202714112
    }

  }
On Wednesday, June 5, 2013 2:06:14 PM UTC-7, Matt Weber wrote:

OK, that version number looked weird to me. What does the output of
jvm nodes info?

curl -XGET 'http://localhost:9200/_nodes/**jvm?pretty=true http://localhost:9200/_nodes/jvm?pretty=true
'

On Wed, Jun 5, 2013 at 1:57 PM, Han hradus...@gmail.com wrote:

Nope. its official Oracle JDK 7.

On Wednesday, June 5, 2013 1:54:24 PM UTC-7, Matt Weber wrote:

Is that OpenJDK? If yes, you should give the latest official Oracle
JDK 7 a try. There have been quite a few issues like this popping up and
the common theme seems to be OpenJDK.

On Wed, Jun 5, 2013 at 1:40 PM, Han hradus...@gmail.com wrote:

We are using version 7.0.210.11

Regarding sorting, most of the time we are using the default
sorting provided by ES (which is sort by the score), on very few queries we
do have sorting based on a couple of numeric fields.

On Wednesday, June 5, 2013 1:25:58 PM UTC-7, Martijn v Groningen
wrote:

I'm not 100% sure, but in general it is a waste to allocate a big
heap space and not use it. While the actual memory can be used if it is not
allocated to ES. I also expect garbage collections to be faster with
smaller jvm's. Btw what Java version are you using?

Are you sorting by score or a field?

On 5 June 2013 22:14, Han hradus...@gmail.com wrote:

Will try that. but do you think its due to having not enough
memory for Lucene file system cache? we have a total of 64gb memory on each
and we have allocated half of it (32gb) to ES_HEAP_SPACE.

We do not have faceting or script but we do have "sorting" in our
queries.

On Wednesday, June 5, 2013 12:59:24 PM UTC-7, Martijn v Groningen
wrote:

The actual heap usage (at most 3.9GB) is way lower than the
allocated heap. I assume you're not using faceting, script or sorting by a
field, right? If that is the case I'd lower the ES_HEAP_SPACE to something
like 5GB. This way you give the filesystem cache more space. Lucene (The
underlying search library the ES uses) depends a lot on the filesystem
cache to execute queres. The more space is available in the filesystem
cache the more Lucene index files end up in it and this will result in
faster queries.

On 5 June 2013 21:19, Han hradus...@gmail.com wrote:

gist on heal usage
here is the gist heap usage
https://gist.github.com/**anonym********ous/5716142 https://gist.github.com/anonymous/5716142

also, we have already enabled term vectors on the fields that
we are doing the highlights.

On Wednesday, June 5, 2013 11:46:05 AM UTC-7, Han wrote:

Thanks Martin.. i will look at the "bool" filter and see if we
can upgrade to 0.90.1, i will keep you posted.

here is the gist heap usage
https://gist.github.com/**anonym********ous/5716142 https://gist.github.com/anonymous/5716142

let me know if you notice anything weird..

On Wednesday, June 5, 2013 11:40:23 AM UTC-7, Martijn v
Groningen wrote:

What kind of searches are you executing? If possible,

can you perhaps them share examples of your queries via a gist?

Here is the gist of our query, most of our queries are like
this with little bit of changes.
https://gist.github.com/**anonym********ous/5715994 https://gist.github.com/anonymous/5715994

I see that you use the top level filter. Unless you are also
using facets (which is not case here), I would recommend putting all
filters in the filtered query. Also if you upgrade to version 0.90.1 I
would use the bool filter over the and, or and not filter in your
case. This will most likely execute your query in a more efficient manner.
In 0.90.0 there is a bug in the bool filter.

Are you highlighting on large fields? If so I would maybe
enable term vectors ("term_vector" : "with_positions_offsets") for these
fields. This will make your index larger, but highlighting will be much
faster.
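
For reference, a mapping with that setting might look like the sketch below. The type name `doc` and the field name `body` are placeholders, not taken from the poster's index. Note also that documents indexed before such a change would not carry term vectors until they are reindexed.

```python
import json


def highlight_field_mapping(type_name, field_name):
    """Mapping body for a string field that stores term vectors with
    positions and offsets, as suggested for faster highlighting."""
    return {
        type_name: {
            "properties": {
                field_name: {
                    "type": "string",
                    "term_vector": "with_positions_offsets",
                }
            }
        }
    }


if __name__ == "__main__":
    # "doc" and "body" are hypothetical names used only for illustration.
    print(json.dumps(highlight_field_mapping("doc", "body"), indent=2))
```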

--
Kind regards,

Martijn van Groningen


simonw_2 · June 7, 2013, 7:55pm

Han,

I wonder what is wrong with 100% CPU? I mean, your servers are doing work,
which is good. Are you having any performance problems? I also
wonder whether you changed any of the thread pool size settings?

simon

On Thursday, June 6, 2013 10:33:21 PM UTC+2, Han wrote:

Thanks for the response Martijn, please see my response inline.

On Thursday, June 6, 2013 12:31:16 PM UTC-7, Martijn v Groningen wrote:

Was this with or without changing the filters in your search requests and
lowering the ES_HEAP_SIZE?

CPU averaged 75% when we reduced the load from 250 rps to 200 rps, regardless of
ES_HEAP_SIZE. Also, we did not change the filters in the high level query since
we need to boost the results based on filter criteria. As you can see, we
have two sets of filters (1. for boosting the score, 2. for filtering out
documents).

You also have 4 replica shards, so in total 5 copies of your data.
Each shard copy (primary or replica) takes up resources. A single shard can
take all your system resources if it needs to. For this reason, just
having 4 replica shards doesn't improve read performance on its own;
you also need the machines for it. High availability can be
achieved with just 1 replica per shard. So maybe just set
index.number_of_replicas to 1?

We can give it a try, but with 4 replica shards it became easier for us to
take down a node and add it back since the data is replicated on every
server. Also, with 1 replica we have noticed the load is not properly
distributed on all servers and since we have really beefy machines we
thought it does not hurt to keep 4 replicas. We can give it a try. Is there
any guideline on how many replicas are needed?
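
To make the replica trade-off concrete, here is a small Python sketch that counts shard copies for the three indices described in the original post (5, 5 and 2 primary shards, 4 replicas each, on 5 nodes). The helper names are invented for this example, the per-node figure assumes copies are spread evenly, and the settings body is only what one would expect to send to the index update settings API.

```python
# (primary shards, replicas) for each of the three indices in the original post
INDICES = [(5, 4), (5, 4), (2, 4)]
NODES = 5


def total_shard_copies(indices, replicas=None):
    """Count primaries plus replicas across all indices.

    If `replicas` is given it overrides the configured replica count,
    which lets us compare the current layout with a proposed one.
    """
    total = 0
    for primaries, configured_replicas in indices:
        r = configured_replicas if replicas is None else replicas
        total += primaries * (1 + r)
    return total


def replica_settings_body(replicas):
    """Body for changing the replica count of an existing index."""
    return {"index": {"number_of_replicas": replicas}}


if __name__ == "__main__":
    for label, override in (("current (4 replicas)", None), ("1 replica", 1)):
        copies = total_shard_copies(INDICES, override)
        # Assumes an even spread of shard copies over the nodes.
        print(f"{label}: {copies} shard copies, ~{copies / NODES:.1f} per node")
    print(replica_settings_body(1))
```

With 4 replicas every node holds a copy of every shard, which matches the observation that any node can be taken down easily; with 1 replica each node holds fewer copies and any single search touches fewer nodes.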

On 6 June 2013 19:33, Han hradus...@gmail.com wrote:
one more observation: when we changed the load from 250 rps to 200 rps
cpu went down to 75% on avg.

On Thursday, June 6, 2013 10:20:48 AM UTC-7, Han wrote:
Can any one help us with cpu spike issue?

On Wednesday, June 5, 2013 2:19:55 PM UTC-7, Han wrote:
output jvm nodes info

gist: https://gist.github.com/anonymous/5717413

"jvm" : {
  "pid" : 8064,
  "version" : "1.7.0_21",
  "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
  "vm_version" : "23.21-b01",
  "vm_vendor" : "Oracle Corporation",
  "start_time" : 1370391059981,
  "mem" : {
    "heap_init" : "32gb",
    "heap_init_in_bytes" : 34359738368,
    "heap_max" : "31.8gb",
    "heap_max_in_bytes" : 34202714112,
    "non_heap_init" : "23.1mb",
    "non_heap_init_in_bytes" : 24313856,
    "non_heap_max" : "130mb",
    "non_heap_max_in_bytes" : 136314880,
    "direct_max" : "31.8gb",
    "direct_max_in_bytes" : 34202714112
  }
}

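
As a quick sanity check on these numbers, the following Python sketch converts the byte counts to gigabytes and relates them to the peak heap usage of about 3.9GB mentioned earlier in the thread. Only the two heap fields are copied from the output above; the function names are illustrative.

```python
import json

# Fragment of the nodes info output quoted above (only the fields used here).
NODE_JVM_INFO = json.loads("""
{
  "jvm": {
    "mem": {
      "heap_init_in_bytes": 34359738368,
      "heap_max_in_bytes": 34202714112
    }
  }
}
""")

GIB = 1024 ** 3  # Elasticsearch's "gb" unit is 1024^3 bytes


def bytes_to_gb(num_bytes):
    """Convert a byte count to gigabytes (1024^3 bytes)."""
    return num_bytes / GIB


def heap_utilisation(used_gb, heap_max_bytes):
    """Fraction of the maximum heap that is actually in use."""
    return used_gb / bytes_to_gb(heap_max_bytes)


if __name__ == "__main__":
    mem = NODE_JVM_INFO["jvm"]["mem"]
    print(f"heap_init: {bytes_to_gb(mem['heap_init_in_bytes']):.2f} GB")
    print(f"heap_max:  {bytes_to_gb(mem['heap_max_in_bytes']):.2f} GB")
    # 3.9 GB is the peak usage Martijn read from the node stats gist.
    share = heap_utilisation(3.9, mem["heap_max_in_bytes"])
    print(f"peak usage is roughly {share:.0%} of the maximum heap")
```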
On Wednesday, June 5, 2013 2:06:14 PM UTC-7, Matt Weber wrote:

OK, that version number looked weird to me. What is the output of the
JVM nodes info?

curl -XGET 'http://localhost:9200/_nodes/jvm?pretty=true'

On Wed, Jun 5, 2013 at 1:57 PM, Han hradus...@gmail.com wrote:

Nope, it's the official Oracle JDK 7.

On Wednesday, June 5, 2013 1:54:24 PM UTC-7, Matt Weber wrote:

Is that OpenJDK? If yes, you should give the latest official
Oracle JDK 7 a try. There have been quite a few issues like this popping
up and the common theme seems to be OpenJDK.

On Wed, Jun 5, 2013 at 1:40 PM, Han hradus...@gmail.com wrote:

We are using version 7.0.210.11

Regarding sorting, most of the time we use the default
sorting provided by ES (sort by score); only a very few queries
sort on a couple of numeric fields.

On Wednesday, June 5, 2013 1:25:58 PM UTC-7, Martijn v Groningen
wrote:

I'm not 100% sure, but in general it is a waste to allocate a big
heap space and not use it, because that memory could be put to use
elsewhere if it were not allocated to ES. I also expect garbage collections
to be faster with smaller JVM heaps. Btw, what Java version are you using?

Are you sorting by score or a field?

On 5 June 2013 22:14, Han hradus...@gmail.com wrote:

Will try that. But do you think it's due to not having enough
memory for the Lucene filesystem cache? We have a total of 64gb memory on each
node and we have allocated half of it (32gb) to ES_HEAP_SIZE.


Han_2 · June 7, 2013, 10:12pm

Simon, thank you for your response.

The problem with 100% CPU is that, with a 5-node cluster, we should be able to
take a node out for maintenance without affecting cluster health. If the
CPU is already at 100% with all 5 nodes in the cluster, we are not sure we can
take one of them down. We just wanted to be on the safe side and keep the CPU
lower.
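
A back-of-the-envelope Python sketch of that concern, using the figures reported in this thread (5 nodes, about 75% CPU at 200 rps). It rests on two loud assumptions that the thread does not confirm: requests are spread evenly over the nodes, and CPU usage grows linearly with the request rate. The function names are hypothetical.

```python
def cpu_after_node_loss(cpu_pct, nodes, nodes_removed=1):
    """Estimated per-node CPU once the same load lands on fewer nodes.

    Assumes even load distribution and linear CPU scaling.
    """
    remaining = nodes - nodes_removed
    if remaining <= 0:
        raise ValueError("at least one node must remain")
    return cpu_pct * nodes / remaining


def cpu_at_rate(cpu_pct, rps, new_rps):
    """Estimated CPU at a different request rate (linear scaling assumed)."""
    return cpu_pct * new_rps / rps


if __name__ == "__main__":
    # Reported in the thread: ~75% CPU at 200 rps on a 5-node cluster.
    print(f"200 rps on 4 nodes: ~{cpu_after_node_loss(75, 5):.2f}% CPU")
    print(f"250 rps on 5 nodes: ~{cpu_at_rate(75, 200, 250):.2f}% CPU")
```

Under these assumptions the estimate for 250 rps falls in the 90-100% range that was reported, and losing one node at 200 rps would push the remaining four to a similar level, so there would be little headroom for maintenance at that load.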

We did not change any thread pool size settings, but we do
notice quite a lot of open threads on each node.

This is from the node stats API:

"threads" : {
  "count" : 277,
  "peak_count" : 280
}

