Memory Usage

Hi Guys,

I have the following situation. I have a 3-node cluster; each node has 32 GB
of DDR3 RAM and two octo-core processors (16 cores, with HT). 50% of the RAM
is allocated to Elasticsearch. We have only one index, with 32 shards and 1
replica each, making 64 shards in the cluster. We have a lot of mappings in
the index (more than 3000, roughly one per user).
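
For completeness, the heap is set the usual way through ES_HEAP_SIZE in the
environment the startup script reads (the exact file depends on how
Elasticsearch is installed; ours looks roughly like this):

  # 50% of the 32 GB of RAM on each node
  ES_HEAP_SIZE=16g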

We have a lot of update queries; in fact, we have only update queries, all
with upserts. We use routing to put similar data in the same shard. The rate
of update queries ranges between 20/sec and 60/sec, and it will increase to
about 130-140/sec when we go live. Our queries are mostly filtered queries
with heavy use of term faceting (we have now moved to the Aggregations
module).
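
To make this concrete, our updates and queries look roughly like the
following (the index, type, field names, routing value and script here are
made up for illustration; the real ones are different):

  # scripted update with upsert, routed so related docs land on the same shard
  curl -XPOST 'localhost:9200/myindex/user_events/42/_update?routing=user42' -d '{
    "script": "ctx._source.counter += 1",
    "upsert": { "counter": 1 }
  }'

  # filtered query with a terms aggregation (previously a terms facet)
  curl -XPOST 'localhost:9200/myindex/_search' -d '{
    "query": {
      "filtered": {
        "query":  { "match_all": {} },
        "filter": { "term": { "user_id": "user42" } }
      }
    },
    "aggs": {
      "by_status": { "terms": { "field": "status" } }
    }
  }'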

We are using doc_values wherever possible to avoid loading field data into
the field data cache on the heap.
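
For reference, this is the kind of mapping change we use to get doc_values
for the fields we facet/aggregate on (the field name is again made up, and
the exact syntax depends a bit on the ES version):

  curl -XPUT 'localhost:9200/myindex/user_events/_mapping' -d '{
    "user_events": {
      "properties": {
        "status": {
          "type": "string",
          "index": "not_analyzed",
          "fielddata": { "format": "doc_values" }
        }
      }
    }
  }'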

For the last 2 weeks I have been noticing that memory usage never drops.
Every node in the cluster sits at about 90% heap usage all the time, which
triggers GC a lot (sometimes more than 5 times a minute). We are using G1GC
instead of the default CMS.

I then stopped both querying and indexing. Memory usage still does not drop
and there are still a lot of GCs per minute. The effect is visible in the
queue we use to feed document updates: I can clearly correlate spikes in the
queue with GC activity, i.e. whenever GC happens, indexing seems to slow
down. This is what worries me.

I am attaching BigDesk screenshots of two of the nodes in the cluster (one
master data node and one non-master data node).

What could be the reason? How should I debug this issue? I will be happy to
share more details if needed.

Vaidik Kapoor
vaidikkapoor.info


Facets use a fair bit of memory, and in general ES will cache as much as it
can to speed up query times.

Your graphs look pretty reasonable to me; lots of small GCs, rather than the
occasional big one, is what you want.
What are you expecting?
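
If you want to see exactly what is sitting on the heap, the node stats API
breaks it down per node; something like this (the exact request form differs
slightly between 0.90 and 1.0) shows heap, fielddata and filter cache sizes:

  # per-node heap and cache breakdown
  curl 'localhost:9200/_nodes/stats?jvm=true&indices=true&pretty=true'
  # look at jvm.mem.heap_used_in_bytes, indices.fielddata.memory_size_in_bytes
  # and indices.filter_cache.memory_size_in_bytes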

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


Just a note: BigDesk does not show a breakdown of GC between the old and new
generations, so for now one needs to be careful when interpreting this
chart. A better chart is coming. There are other tools that will show you
the GC breakdown (Sematext SPM, Marvel, HQ might as well, or you can use
VisualVM ...).
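
If you want the raw numbers in the meantime, the node stats API exposes the
per-collector counts; roughly:

  curl 'localhost:9200/_nodes/stats?jvm=true&pretty=true'
  # jvm.gc.collectors lists collection_count and collection_time_in_millis
  # for each collector, so you can see which generation is doing the work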

Lukáš


On 30 January 2014 14:44, Mark Walkom markw@campaignmonitor.com wrote:

Facets use a fair bit of memory, and in general ES will cache as much as it
can to speed up query times.

I have no issue with ES using memory for caches, but I don't see the memory
consumption drop at all even when I am doing nothing.
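
As an experiment, I may try clearing the caches explicitly and watching
whether the heap actually comes down, something along these lines:

  # clear fielddata/filter caches across the cluster
  curl -XPOST 'localhost:9200/_cache/clear'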

Your graphs look pretty reasonable to me; lots of small GCs, rather than the
occasional big one, is what you want.
What are you expecting?

Agreed, and that was the reason for switching to G1GC. But the garbage
collection cycles do not seem to free up much memory, even when I am not
performing any operations on the ES cluster. My main problem is that I am
writing/updating about 30-40 docs per second (mostly scripted updates), and
very often I see spikes in the queue I am writing from. I was more or less
able to relate these spikes to GC activity, although there is no concrete
evidence.

I know that Elasticsearch is capable of a lot more, and that's what concerns
me. I don't think the load I have is too much for Elasticsearch to take, so
I need to understand what could be causing the periodic slowdown in
indexing/updates.
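
To get harder evidence, I am planning to turn on GC logging and the index
slow log so I can line up pause times with the queue spikes; roughly this
(paths and thresholds are just examples):

  # JVM GC logging, e.g. via ES_JAVA_OPTS or the startup script
  -Xloggc:/var/log/elasticsearch/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps

  # index slow log thresholds in elasticsearch.yml
  index.indexing.slowlog.threshold.index.warn: 2s
  index.indexing.slowlog.threshold.index.info: 1s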



Thanks for that, Lukas. Awaiting the next version. :)
