Memory requirements and settings

I'm using Elasticsearch 0.17.7 on two nodes (2/5), each with 4 GB RAM.
One of the servers runs RabbitMQ as well, to receive messages that feed
into ES through the river plugin. I'm using the default heap memory
settings, 256m/1g.

My problem is that I get Java heap errors when running queries against
the cluster. I currently have about 10 indices, each with 6-10 GB of
data and 10M documents. I tried setting the max heap to 2 GB and could
run 1-2 queries (I'm using the Logstash web UI, and I just tried a
date range query) before the ES cluster became unavailable (no heap
error this time; I don't seem to get it every time). Parts of ES seem
to be running, but es-head shows that the cluster is down, and curl
searches never return.
I also tried setting the max heap to 512m; then I get heap errors
while starting ES.

It worked great in ES 0.17.6; these errors started when I added a
2nd node and upgraded to ES 0.17.7.

I'm not sure what info is needed to help troubleshoot this. Any
suggestions are appreciated.

Can you gist your search request? Are you using facets when you search?
Nothing that should affect this has changed between 0.17.6 and 0.17.7, so
it's strange. Did your dataset, or the way you search, possibly change?

(In reply to Hakan Lindestaf's original post of Wed, Oct 5, 2011, 9:36 AM.)

This is using the Logstash UI so I'm trying to figure out how I can see what the search parameters are. I only see this in the log, not sure if that is enough:

I, [2011-10-05T18:35:30.987000 #5468] INFO -- runner.class: Elasticsearch search: * @timestamp:[2011-09-28 TO 2011-10-07]
I, [2011-10-05T18:46:42.851000 #5468] INFO -- runner.class: [
[0] "Got search results (in blocking mode)",
[1] {
:query => "* @timestamp:[2011-09-28 TO 2011-10-07]",
:duration => "11.1m",
:result_count => 42214243
}
]

It was just a search on everything between 9/28 and 10/5 and it took 11 minutes (and after that the ES cluster is not happy at all, I don't see the indexes in ES-Head any more, etc).

I don't think Logstash uses facets; at least, I don't see any in the UI when searching.

The data size has increased. I probably only had 1-2 days (maybe 10 GB of data) when it worked. After that I added the second node, which runs 0.17.7; I figured that caused the issues, so I upgraded the master to 0.17.7 as well. But it's still not happy.

/Hakan

(In reply to Shay Banon's message of Oct 5, 2011, 2:59 AM.)

How big is the dataset now? How many indices do you have? You might just
have too much data to hold with the memory you allocate to it. Note that
adding another node does not, by default, really give you more "resources"
for the index: index shards are replicated by default (with 1 replica), so
if you start another node, the replicas will be allocated on that node.
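
The point about replicas can be checked with a little arithmetic. The sketch below is only an illustration: `copies_per_node` is a hypothetical helper, the figures (10 indices, 5 primary shards each, 1 replica) are assumptions based on the defaults and numbers mentioned in this thread, and it assumes shard copies spread evenly across nodes.

```python
def copies_per_node(indices, primaries, replicas, nodes):
    """Average number of shard copies each node hosts.

    A replica is never placed on the node that already holds the same
    shard, so at most `nodes` copies of each shard can be assigned.
    """
    copies_per_shard = min(1 + replicas, nodes)
    return indices * primaries * copies_per_shard / nodes


# Assumed figures: 10 indices, 5 primary shards each, 1 replica.
one_node = copies_per_node(10, 5, 1, nodes=1)   # replicas stay unassigned
two_nodes = copies_per_node(10, 5, 1, nodes=2)  # replicas go to the new node

print(one_node, two_nodes)  # 50.0 50.0
```

Under these assumptions each node carries the same number of shard copies before and after the second node joins, which is why the extra node does not relieve memory pressure until the replica count or shard count changes.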

If you want, there should be a heap dump file when an out of memory failure
occurs. It should be in the working directory of the process (might be under
bin, or ES_HOME), and should be called something like heap.bin. You can
compress it and give me a link to download it (Dropbox?), and I can take a
look at what's taking all the memory.

(In reply to Hakan Lindestaf's message of Thu, Oct 6, 2011, 12:57 AM.)

I have about 5-10 GB per day and one index per day, and I had about 8-9 days, so about 70 GB of data and 80M documents. I may have too much data, but I'm curious how I can scale this. In other words, how much memory would I need to support, say, 10 times this amount? I somehow figured ES would still run with less memory, just not as quickly. I was also surprised that the ES-Head browser shows me a list of all 80M documents with sub-second responses (paged, but the count is still there). That's the same thing Logstash does, but the last query took 11 minutes.
I realize showing the queries would really help; I just can't find any logging settings that show them. I also tried capturing network traffic, but I still can't make out the queries. Any tips?
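
One possible way to get something shareable is to rebuild a request from the query text that does appear in the Logstash log. The sketch below is a guess, not what Logstash actually sends: it assumes the logged text is passed to a `query_string` query, and `query_from_log`, `search_body`, and the `size` value are hypothetical names and choices made for illustration.

```python
import json
import re

# Log line copied from the Logstash output quoted earlier in the thread.
LOG_LINE = (
    "I, [2011-10-05T18:35:30.987000 #5468] INFO -- runner.class: "
    "Elasticsearch search: * @timestamp:[2011-09-28 TO 2011-10-07]"
)


def query_from_log(line):
    """Return the text after 'Elasticsearch search: ', or None if absent."""
    match = re.search(r"Elasticsearch search: (.*)$", line)
    return match.group(1) if match else None


def search_body(query, size=50):
    """Wrap the query text in a query_string search body (assumed shape)."""
    return {"query": {"query_string": {"query": query}}, "size": size}


query = query_from_log(LOG_LINE)
print(json.dumps(search_body(query), indent=2))
```

The printed JSON could then be pasted into a gist or sent to the `_search` endpoint with curl to see whether the slowness reproduces outside the Logstash UI.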

Thanks,
/Hakan

(In reply to Shay Banon's message of Oct 6, 2011, 7:52 AM.)

Regarding how fast es-head works compared to Logstash, I don't really know.
If it's the same simple range-based query, then it should be fast. Maybe you
just queried a single index?

You might be overloading the system a bit. For that kind of data, I suggest
using indices that have 1 shard (and 1 replica, for HA). It's more than
enough (assuming you create an index per day). Another option is to move to
an index per week. Fewer shards per node means less memory usage (since each
shard is a Lucene index). You can set the number of shards and replicas in
the Elasticsearch config, and it will automatically apply to newly created
indices.
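
Besides the node-level config, the same two values can be supplied per index when it is created. The sketch below only builds the request body and does not contact a cluster; `index_settings` is a hypothetical helper, while `number_of_shards` and `number_of_replicas` are the index setting names (the config-file form would be `index.number_of_shards` and `index.number_of_replicas`).

```python
import json


def index_settings(shards=1, replicas=1):
    """Build a create-index body with the given shard and replica counts."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


# 1 shard with 1 replica, as suggested above for a daily index.
body = json.dumps(index_settings(shards=1, replicas=1), indent=2)
print(body)
```

Note that the shard count is fixed once an index exists, so a change like this only affects indices created afterwards.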

In terms of scaling out even further, you just need to add more nodes. 2
nodes with 1 replica means that the second node is just acting as a "backup"
(though it's more dynamic than that). If you add another node, shards will
automatically start to move around and balance.

Another option is to use 0 replicas. In this case, a second node means
balancing shards across the two nodes (since there are no replicas). It's
still "safe" in terms of data not being lost, as long as you don't lose the
drive of the machine (or are willing to live with that). In this case it's
OK for the machine to go down and then be started back up on the same data
location; while it's down, you will simply get results from the "active"
shards on the other nodes.

(In reply to Hakan Lindestaf's message of Thu, Oct 6, 2011, 8:26 PM.)

Ah, I think that makes a lot of sense. I've changed it to 1 shard, weekly index and 0 replicas (I don't really care about HA at this point). I'll let it load up some data and see.
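
To put rough numbers on that change, here is a small sketch comparing the two layouts. It is illustrative only: `total_shard_copies` is a hypothetical helper, the 9 days of retained data comes from the "8-9 days" mentioned earlier, and the old layout is assumed to be daily indices with 5 shards and 1 replica.

```python
import math


def total_shard_copies(days_retained, days_per_index, primaries, replicas):
    """Total shard copies (Lucene indices) the cluster has to hold."""
    indices = math.ceil(days_retained / days_per_index)
    return indices * primaries * (1 + replicas)


before = total_shard_copies(9, 1, 5, 1)  # daily index, assumed 5 shards, 1 replica
after = total_shard_copies(9, 7, 1, 0)   # weekly index, 1 shard, 0 replicas

print(before, after)  # 90 2
```

Since each shard copy is a separate Lucene index with its own memory overhead, going from dozens of copies to a handful is consistent with the advice that fewer shards per node means less memory usage.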

/Hakan

(In reply to Shay Banon's message of Oct 6, 2011, 11:58 AM.)

To follow up on this. I've been running with 1 shard/index, 0 replicas and weekly index rotation for a few days now and it's working perfectly! Thanks for the information and help!

/Hakan
