This query brought down my cluster: {"query": {"function_score": {"query": {"match_all": {}}, "random_score": {"seed": 123456}}}}

Jinyuan_Zhou · December 23, 2014, 3:28am

Hi,
I'd like to share my experience and, at the same time, I hope to get some
tips.

The query was run against an index with about 700 million documents.
Two things happened:

1. The node that ran this query crashed. It is the node configured not to
   process data.
2. The data nodes went into heavy GC. Eventually old-generation GC could not
   reduce heap usage and the nodes became unresponsive. In some cases,
   old-generation GC even increased the size of the heap:

[2014-12-20 07:21:03,370][WARN ][monitor.jvm ] [****]
[gc][young][2796041][224976] duration [1.1s], collections [1]/[1.3s], total
[1.1s]/[3.4h], memory [21.5gb]->[21.2gb]/[29.8gb], all_pools {[young]
[1.4gb]->[3.4mb]/[1.4gb]}{[survivor]
[191.3mb]->[191.3mb]/[191.3mb]}{[old] [19.9gb]->[21gb]/[28.1gb]}

It is a bad query in itself, but I expected the ES cluster to handle it
gracefully. It does throw this exception:

Caused by: org.elasticsearch.common.breaker.CircuitBreakingException:
[FIELDDATA] Data too large, data for [_uid] would be larger than limit of
[19206989414/17.8gb]
I guess ES stopped at some point because field data exceeded the default
limit, but by then it was too late to stop the query that caused the heap
memory problem.
I am wondering whether there is anything obviously wrong with my ES cluster
configuration.
I have 5 boxes, each with 125G of RAM and 32 cores. I deploy two data nodes
on each of them, with the heap fixed at 31G and a configuration that favors
bulk ingestion. I actually saw ingest throughput above 60K documents per
second. It was working fine until that query came along.
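
As a rough sanity check on the limit in the exception above, the sketch below converts 19206989414 bytes to GiB and works out what heap size it would imply. It assumes the field data breaker limit is a fixed fraction of the JVM heap and that the fraction is 60%, which is the documented default for 1.x releases of that period; the post only says "default limit", so the fraction is an assumption, and the helper name is made up for illustration.

```python
# Figures taken from the exception and GC log quoted in this post.
LIMIT_BYTES = 19206989414      # "limit of [19206989414/17.8gb]"
GIB = 2 ** 30                  # bytes per GiB

# Assumption: the breaker limit is 60% of the heap (not stated in the post).
ASSUMED_FRACTION = 0.60


def implied_heap_gib(limit_bytes, fraction):
    """Heap size in GiB that would produce this limit at the given fraction."""
    return limit_bytes / fraction / GIB


limit_gib = LIMIT_BYTES / GIB
heap_gib = implied_heap_gib(LIMIT_BYTES, ASSUMED_FRACTION)

print(f"breaker limit: {limit_gib:.2f} GiB")
print(f"implied heap at {ASSUMED_FRACTION:.0%}: {heap_gib:.2f} GiB")
```

Under that assumption the implied heap comes out at roughly 29.8 GiB, which lines up with the heap total in the GC line (memory [21.5gb]->[21.2gb]/[29.8gb]). That would suggest the breaker was at its default rather than misconfigured, and that the limit applies per node, not per query.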

Thanks,

Jack

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ae1b7ea6-d801-4d67-b047-69ab54f1f38b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jinyuan_Zhou · December 23, 2014, 7:30pm

OK, I think I misread the old-generation GC issue. The line I quoted shows
the size of the old-generation pool after a young GC. Growth there is
expected after a young-generation GC, because some surviving objects are
promoted from young to survivor and from survivor to old.
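
To make that reading of the log line concrete, here is a minimal sketch that pulls the per-pool before/after sizes out of the quoted monitor.jvm line. The regular expression is written against this one line only and is an assumption about the format, not something taken from Elasticsearch itself; the function and constant names are made up for illustration.

```python
import re

# The monitor.jvm line quoted earlier in the thread, joined onto one line.
GC_LINE = (
    "[gc][young][2796041][224976] duration [1.1s], collections [1]/[1.3s], "
    "total [1.1s]/[3.4h], memory [21.5gb]->[21.2gb]/[29.8gb], all_pools "
    "{[young] [1.4gb]->[3.4mb]/[1.4gb]}{[survivor] "
    "[191.3mb]->[191.3mb]/[191.3mb]}{[old] [19.9gb]->[21gb]/[28.1gb]}"
)

# Conversion factors to megabytes for the units that appear in the line.
UNIT_MB = {"kb": 1 / 1024, "mb": 1.0, "gb": 1024.0}

# Matches "[pool] [before]->[after]/[capacity]" for the three heap pools.
POOL_RE = re.compile(
    r"\[(young|survivor|old)\]\s*"
    r"\[([\d.]+)(kb|mb|gb)\]->\[([\d.]+)(kb|mb|gb)\]/\[([\d.]+)(kb|mb|gb)\]"
)


def parse_pools(line):
    """Return {pool: (before_mb, after_mb, capacity_mb)} for one GC log line."""
    pools = {}
    for name, before, b_unit, after, a_unit, cap, c_unit in POOL_RE.findall(line):
        pools[name] = (
            float(before) * UNIT_MB[b_unit],
            float(after) * UNIT_MB[a_unit],
            float(cap) * UNIT_MB[c_unit],
        )
    return pools


for pool, (before, after, capacity) in parse_pools(GC_LINE).items():
    print(f"{pool:9s} change: {after - before:+9.1f} MB (capacity {capacity:.1f} MB)")
```

On the quoted line, the young pool empties almost completely while the old pool grows by about 1.1 GB, which is what promotion during a young collection looks like. A line tagged [gc][old] would be the one to check for whether an old-generation collection actually freed anything.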

Jinyuan (Jack) Zhou

