How to improve the cluster's performance for a JVM out-of-memory exception?

Hi Everyone,

We face a performance issue (a JVM OutOfMemoryError) when faceting over a
huge index. We tried moving the faceted fields from array type to string
type, but saw no improvement. I would like to know whether we are using
the current cluster machines to their maximum efficiency.

We have 2 server machines in the cluster, each with the configuration
below:

RAM: 128 GB
Disk (index data): RAID 5, 7200 rpm
Processor: 24 cores

ES configuration:
JVM max heap: 16 GB (the rest of the RAM is currently unused)
Total index size: 850 GB

In another thread, kimchy answered that increasing the JVM heap can
overcome the exception. How much can I increase it given the current
machines' configuration? Is there a need for a new cluster in this case?

Also, I am interested to know the maximum efficiency that could be pulled
out of a single cluster.

Please let me know if I need to provide any more info for deciding on
this.

Thanks a lot!
Manoj

Kimchy, any information about this?

Really hoping to learn what is happening behind the scenes...

I can't seem to find it at the moment, but there was another post where
kimchy mentions that performance seriously degrades if Elasticsearch
starts paging. Since you seem to have a lot of unused RAM, you might try
using the rest. Also, what type of performance are you trying to optimize:
search, filter, percolate? Depending on what you're doing there are a lot
of settings to play with.
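
If paging is the worry, the knob I remember for this (a sketch, not a
tested config; check the docs for your version) is asking the JVM to lock
its heap in RAM via elasticsearch.yml:

    # elasticsearch.yml -- hedged sketch; verify against your ES version
    bootstrap.mlockall: true   # lock the heap so the OS cannot swap it out

Note the heap has to fit comfortably in physical RAM for this to be safe.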

You can dedicate a lot more memory to the ES JVM. Typically you can
dedicate half of the physical memory, so you could try 64 GB, or even more
if you still experience problems.
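
For example (a sketch; the exact variable names depend on your ES version
and startup script, so treat these as illustrative), the heap can be set
with environment variables before starting the node:

    # hedged sketch -- older scripts read ES_MIN_MEM/ES_MAX_MEM,
    # newer ones also accept ES_HEAP_SIZE; check bin/elasticsearch.in.sh
    export ES_MIN_MEM=64g
    export ES_MAX_MEM=64g   # keep min and max equal to avoid heap resizing
    ./bin/elasticsearch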

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

Thanks for the reply, shadow000fire.

We need more physical memory when we facet on a field that holds a lot of
tag-like data. Unless we do that faceting, everything works fine with the
current RAM, but we need the feature at present. Also, I want to ask you
this:

Assume faceting on that particular field requires 64 GB to work fine for a
single search. Does it then require a multiple of that size as we increase
the number of concurrent users (i.e. 128 GB RAM for 2 concurrent searches,
etc.)?

Is there any more optimization I need to do, ES-settings-wise, for the
situation we face?

I've been facing the same issue, and here is what I think I know and what
I've done. I would love corrections or other people's experiences. :)

Facet memory usage is determined by the number of documents, the size of
the field, and, if using multi-value fields, the maximum number of values
in a single field. Using jvisualvm, it appears almost all the memory is
int arrays, which I think are the ordinal arrays being passed around
everywhere. For memory reduction, switching from arrays to sparse arrays
might help, but that is a Kimchy question, and it would probably hurt
performance.
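
As a rough back-of-envelope (my own assumption, not a measured number): if
ordinals are 4-byte ints, a single-valued facet field over 100 million
documents needs on the order of 100M x 4 B = ~400 MB for the ordinals
alone, and a multi-valued field scales that by the maximum number of
values per document.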

What I've done to reduce memory (a combined settings sketch follows the
list):

  • Disable _all since I don't need it ("_all" : {"enabled" : false})
  • Enable _source compression ("_source" : {"compress" : true})
  • Enable the soft field cache, which seems to let the process run longer
    before hitting OOM, at the cost of slower queries. Since OOMs are evil
    this is an OK trade; just use bigdesk to see how much more memory you
    need. ("index.cache.field.type" : "soft")
  • Turn off replication ("index.number_of_replicas" : 0)
  • Only have 1 or 2 shards per node
  • Make sure the shards are evenly distributed
    ("index.routing.allocation.total_shards_per_node" : 1 (or 2))
  • If using string facets, make sure the field is mapped as "not_analyzed"
  • I switched from string tags to short tags, with a separate index/type
    that holds the string->short conversion. I create the conversion on
    the fly using the _version technique with yet another index/type. (See
    http://blogs.perl.org/users/clinton_gormley/2011/10/elasticsearchsequence---a-blazing-fast-ticket-server.html)
  • Since I'm using multi-value shorts, I reduced the maximum number of
    tags per document, which really helped
  • Switched to _bulk indexing
  • Lowered my refresh interval ("index.refresh_interval" : 60)
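
Pulled together, the index-level settings look roughly like this (a sketch
from memory against the 0.19-era API; the index/type names and values are
just illustrative):

    # hedged sketch -- index/type names are made up
    curl -XPUT 'localhost:9200/myindex' -d '{
      "settings" : {
        "index.number_of_replicas" : 0,
        "index.routing.allocation.total_shards_per_node" : 2,
        "index.refresh_interval" : "60s"
      },
      "mappings" : {
        "mytype" : {
          "_all" : { "enabled" : false },
          "_source" : { "compress" : true },
          "properties" : {
            "tags" : { "type" : "string", "index" : "not_analyzed" }
          }
        }
      }
    }'

    # the soft field cache is a node-level setting in elasticsearch.yml:
    # index.cache.field.type: soft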

For your situation you should definitely give Elasticsearch more memory. I
would be tempted to run 3 or more 20-30 GB nodes on each 128 GB machine to
help spread out the GC pauses. (If using Linux with multiple nodes per
machine, don't use mlockall unless you have a 3.x kernel.) That said, the
multi-node vs. single-huge-node question doesn't seem settled yet; I'm
about to switch to 64 GB machines and I'm going to try 2x20 GB nodes per
machine. You definitely want to leave some free memory for the disk cache;
the "free" command on Linux shows what is currently being used in the
"cached" column.
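
Running two nodes on one box can look something like this (a sketch; the
node names and ports are made up, and the -Des.* style of passing settings
is from the old startup script, so double-check it for your version):

    # hedged sketch: two 20 GB-heap nodes on one machine
    ES_HEAP_SIZE=20g bin/elasticsearch -Des.node.name=node-a \
        -Des.http.port=9200 -Des.transport.tcp.port=9300 &
    ES_HEAP_SIZE=20g bin/elasticsearch -Des.node.name=node-b \
        -Des.http.port=9201 -Des.transport.tcp.port=9301 &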

Thanks,
Andy

@Andy, I just want to pick your brain about the single-huge-node vs.
multiple-smaller-nodes tradeoff and how it worked out for you. Anybody
else watching this thread, your feedback is highly appreciated too.


  • Lowered my refresh interval ("index.refresh_interval" : 60)

Just a quick note - it appears that the default unit for refresh_interval, if you don't specify one, is milliseconds. We ran our cluster for a short while with a rather more frequent refresh_interval than we wanted!
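
In other words (the index name is made up, but the update-settings call
itself is the standard API), spell the unit out:

    # "60" alone is read as 60 ms; "60s" is 60 seconds
    curl -XPUT 'localhost:9200/myindex/_settings' -d '{
      "index" : { "refresh_interval" : "60s" }
    }'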

Cheers,
Dan

Dan Fairs | dan.fairs@gmail.com | @danfairs | secondsync.com
