ES hangs after some time

Hello,

I have a single-node cluster running on JDK 7 with the G1 GC
configured. ES_HEAP_SIZE is 16G and total RAM is 32G. I'm running queries
against all the indices serially, one after another. Each index contains 2
shards, with around 5GB of data in each shard.
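For reference, the setup described above would typically be configured via environment variables picked up by the 0.90-era startup scripts; the exact file names and the use of ES_JAVA_OPTS for the GC flag are assumptions, so check your install:

```shell
# 0.90-era Elasticsearch reads the heap size from this environment
# variable in bin/elasticsearch.in.sh (file name assumed; check your install)
export ES_HEAP_SIZE=16g

# Enable G1 instead of the default CMS collector (standard HotSpot flag on
# JDK 7; note G1 was not the tested default for Elasticsearch at the time)
export ES_JAVA_OPTS="-XX:+UseG1GC"
```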

After some time, the queries get really slow and finally ES hangs without
any indication in the logs. CPU usage for the ES process stays in the
100-400% range. ES logs can be seen at
https://gist.github.com/anandnalya/6061423#file-es-g1-log

Here's the 'Memory' graph:
https://apps.sematext.com/spm/s/SE4TMUtXYe

Any ideas what might be the problem here?

Following is a log of the time taken by each query and the number of
records fetched.
Index, Date, Records, Time(ms)
interval_89846,Tue Jul 23 15:02:13 IST 2013,136,15998
interval_89847,Tue Jul 23 15:02:24 IST 2013,85,10772
interval_89856,Tue Jul 23 15:02:36 IST 2013,156,12578
interval_89857,Tue Jul 23 15:02:53 IST 2013,172,16787
interval_89858,Tue Jul 23 15:03:05 IST 2013,177,12159
interval_89859,Tue Jul 23 15:03:21 IST 2013,184,15794
interval_89860,Tue Jul 23 15:03:36 IST 2013,177,15070
interval_89861,Tue Jul 23 15:03:46 IST 2013,163,9481
interval_89862,Tue Jul 23 15:04:02 IST 2013,169,16280
interval_89863,Tue Jul 23 15:04:19 IST 2013,163,17147
interval_89864,Tue Jul 23 15:04:35 IST 2013,168,15999
interval_89865,Tue Jul 23 15:04:52 IST 2013,163,17433
interval_89866,Tue Jul 23 15:05:09 IST 2013,183,16752
interval_89867,Tue Jul 23 15:05:27 IST 2013,161,17449
interval_89868,Tue Jul 23 15:05:45 IST 2013,159,18394
interval_89869,Tue Jul 23 15:06:01 IST 2013,165,16189
interval_89870,Tue Jul 23 15:06:15 IST 2013,172,13666
interval_89871,Tue Jul 23 15:06:30 IST 2013,154,14952
interval_89872,Tue Jul 23 15:06:48 IST 2013,192,18600
interval_89873,Tue Jul 23 15:07:05 IST 2013,164,16868
interval_89874,Tue Jul 23 15:07:24 IST 2013,185,18795
interval_89875,Tue Jul 23 15:07:38 IST 2013,145,14194
interval_89876,Tue Jul 23 15:08:06 IST 2013,316,28196
interval_89877,Tue Jul 23 15:08:31 IST 2013,438,24893
interval_89878,Tue Jul 23 15:08:55 IST 2013,330,23150
interval_89879,Tue Jul 23 15:09:13 IST 2013,359,18577
interval_89880,Tue Jul 23 15:09:25 IST 2013,168,12212
interval_89881,Tue Jul 23 15:10:10 IST 2013,174,44907
interval_89882,Tue Jul 23 15:10:25 IST 2013,157,14315
interval_89889,Tue Jul 23 15:10:34 IST 2013,126,9759
interval_89890,Tue Jul 23 15:10:49 IST 2013,142,14245
interval_89891,Tue Jul 23 15:11:00 IST 2013,98,11146
ainterval_89846,Tue Jul 23 15:11:37 IST 2013,136,37691
ainterval_89847,Tue Jul 23 15:11:48 IST 2013,85,10835
ainterval_89856,Tue Jul 23 15:12:20 IST 2013,156,32181
ainterval_89857,Tue Jul 23 15:12:59 IST 2013,172,38443
ainterval_89858,Tue Jul 23 15:13:31 IST 2013,177,32252
ainterval_89859,Tue Jul 23 15:14:17 IST 2013,184,45576
ainterval_89860,Tue Jul 23 15:14:46 IST 2013,177,29345
ainterval_89861,Tue Jul 23 15:15:47 IST 2013,163,61020
ainterval_89862,Tue Jul 23 15:17:54 IST 2013,169,127129
ainterval_89863,Tue Jul 23 15:36:10 IST 2013,163,1096058

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Added gist for hot threads:
https://gist.github.com/anandnalya/6061423#file-hotthreads


Can you give an example of the query you use?

I see you are using the HasChildFilter, which works heavily against
the G1 garbage collector.

Jörg


Hi,

The query is:

{
  "filtered" : {
    "query" : {
      "range" : {
        "c100" : {
          "from" : "0",
          "to" : "5293983999",
          "include_lower" : true,
          "include_upper" : true
        }
      }
    },
    "filter" : {
      "wrapper" : {
        "filter" : {
          "and" : [ {
            "or" : [ {
              "query" : {
                "query_string" : {
                  "query" : "c76:\"p2p\"",
                  "default_operator" : "and"
                }
              }
            }, {
              "query" : {
                "query_string" : {
                  "query" : "c76:\"http\"",
                  "default_operator" : "and"
                }
              }
            } ]
          }, {
            "has_child" : {
              "query" : {
                "query_string" : {
                  "query" : "microsoft",
                  "fields" : [ "content" ]
                }
              },
              "child_type" : "mlivemass_content"
            }
          }, {
            "query" : {
              "query_string" : {
                "query" : "c150:\"29\"",
                "default_operator" : "and"
              }
            }
          } ]
        }
      }
    }
  }
}
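The embedded phrase quotes (e.g. `c76:"p2p"`) have to be backslash-escaped for the JSON body to parse, which is easy to get wrong by hand. Building the filter programmatically sidesteps that; here is a sketch that constructs the and/or part of the filter above (the `has_child` clause and the ES client call are omitted, and `qs_filter` is a hypothetical helper, not an ES API):

```python
import json

def qs_filter(query: str) -> dict:
    """Wrap a Lucene query string in the query-filter shape used above."""
    return {"query": {"query_string": {"query": query,
                                       "default_operator": "and"}}}

# json.dumps escapes the embedded phrase quotes automatically
filt = {
    "and": [
        {"or": [qs_filter('c76:"p2p"'), qs_filter('c76:"http"')]},
        qs_filter('c150:"29"'),
    ]
}
body = json.dumps(filt)
print(body)
```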
