Before firing queries, you should consider whether the index design and
query choice are optimal.
Numeric range queries are not straightforward. They were a major issue on
inverted index engines like Lucene/Elasticsearch, and it took some time
to introduce efficient implementations. See e.g.
https://issues.apache.org/jira/browse/LUCENE-1673
ES tries to compensate for the downsides of massive numeric range queries by
loading all the field values into memory. To achieve effective queries, you
have to carefully discretize the values you index.
For example, a few hundred million distinct timestamps with
millisecond resolution are a real burden for searching on inverted
indices. A good discretization strategy for indexing is to reduce the total
number of distinct values in such a field to a few hundred or a few
thousand. For timestamps, this means that indexing time series data in
discrete intervals of days, hours, minutes, or maybe seconds is much more
efficient than indexing at e.g. millisecond resolution.
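
To make the discretization idea concrete, here is a minimal sketch in Python of rounding epoch-millisecond timestamps down to a coarser bucket before indexing. The names `BUCKET_MS`, `discretize`, and `distinct_values` are made up for illustration and are not part of any Elasticsearch API. It also assumes the timestamps are plain epoch milliseconds in UTC.

```python
# Bucket widths in milliseconds. These names are illustrative only.
BUCKET_MS = {
    "millisecond": 1,
    "second": 1_000,
    "minute": 60_000,
    "hour": 3_600_000,
    "day": 86_400_000,
}


def discretize(timestamp_ms, resolution):
    """Round an epoch-millisecond timestamp down to the start of its bucket."""
    step = BUCKET_MS[resolution]
    return timestamp_ms - (timestamp_ms % step)


def distinct_values(timestamps_ms, resolution):
    """Count how many distinct values the field would hold at this resolution."""
    return len({discretize(t, resolution) for t in timestamps_ms})


if __name__ == "__main__":
    start = 1_408_752_000_000  # 2014-08-23T00:00:00Z in epoch milliseconds
    # Synthetic data: one event every 137 ms, all within a single day.
    events = [start + i * 137 for i in range(600_000)]
    for resolution in ("millisecond", "second", "minute", "hour", "day"):
        print(resolution, distinct_values(events, resolution))
```

For this synthetic series of 600,000 events, the field holds 600,000 distinct values at millisecond resolution, 82,200 at second resolution, 1,370 at minute resolution, and 23 at hour resolution. The original millisecond value can still be kept in a separate, non-searched field if it is needed for display.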
Another point is to use filters instead of queries for boolean conditions.
Filters are much faster because they skip scoring and can be cached.
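
As a rough sketch of what that looks like, the Python function below builds a search body that puts an exact match and a range condition into filters. It assumes the 1.x query DSL that was current at the time of this thread, where a `filtered` query wraps a `bool` filter. Later Elasticsearch versions express the same thing with a `filter` clause inside a `bool` query. The function name and the field names in the usage example are hypothetical.

```python
import json


def filtered_bool_body(term_field, term_value, range_field, gte, lt):
    """Build a search body with one exact match and one range, both as filters.

    Assumes the Elasticsearch 1.x "filtered" query syntax.
    """
    return {
        "query": {
            "filtered": {
                # No scoring is needed, so match everything and let filters decide.
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {term_field: term_value}},
                            {"range": {range_field: {"gte": gte, "lt": lt}}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # "user_id" and "created_at" are made-up field names.
    body = filtered_bool_body(
        "user_id", 42, "created_at", 1_408_752_000_000, 1_408_838_400_000
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized to JSON and sent as the body of a `_search` request.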
Jörg
On Sat, Aug 23, 2014 at 2:19 PM, Narendra Yadala narendra.yadala@gmail.com
wrote:
Hi Ivan,
Thanks for the input about aggregating on strings. I do that, and those
queries take time, but they do not crash the node.
The queries which caused problems were pretty straightforward (such
as a boolean query with two musts, one an exact match and the other a
range match on a long), but the real problem was the size parameter. When I
set size to Integer.MAX_VALUE, it caused all the problems. When I removed
it, everything started working fine. I think this behavior is worth
documenting somewhere (probably expected, but surprising).
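
One way to avoid an unbounded size is to request results in pages of a fixed, modest size. The sketch below assumes the standard `from` and `size` search parameters. The function name `paged_bodies` and the default page size of 1000 are arbitrary choices for illustration, not recommendations from the Elasticsearch documentation.

```python
def paged_bodies(query, total_wanted, page_size=1000):
    """Yield search bodies that fetch up to `total_wanted` hits in bounded pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while offset < total_wanted:
        # The last page may be smaller than page_size.
        size = min(page_size, total_wanted - offset)
        yield {"query": query, "from": offset, "size": size}
        offset += size


if __name__ == "__main__":
    for body in paged_bodies({"match_all": {}}, total_wanted=2500):
        print(body["from"], body["size"])
```

Deep paging with `from` and `size` gets more expensive as the offset grows, so for exporting a complete result set the scroll API is probably the better fit.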
I did double the RAM, though, and now I have allocated 5*10G RAM to
the cluster. Things are looking OK as of now, except that the aggregations
(on strings) are quite slow. Maybe I will run these aggregations as a batch
job, cache the outputs in a different type, and move on for now.
Thanks
NY
On Fri, Aug 22, 2014 at 10:34 PM, Ivan Brusic ivan@brusic.com wrote:
How expensive are your queries? Are you using aggregations or sorting on
string fields that could use up your field data cache? Are you using the
defaults for the cache? Post the current usage.
If you post an example query and mapping, perhaps the community can help
optimize it.
Cheers,
Ivan
On Fri, Aug 22, 2014 at 12:28 AM, Narendra Yadala <
narendra.yadala@gmail.com> wrote:
I have a cluster of size 240 GB (including replicas) with 5 nodes
in it. I allocated 5 GB RAM to each node (5*5 GB in total) and started the
cluster. When I start continuously firing queries at the cluster, GC
starts kicking in and eventually a node goes down because of an OutOfMemory
exception. I add up to 200k documents every day. The indexing part works
fine, but the querying part is causing trouble. I have the cluster on EC2
and I use EC2 discovery mode.
What is the ideal RAM size, and are there any other parameters I need to
tune to get this cluster going?
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/5b659d11-d757-4f8e-b347-60b3807c2dfe%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/5b659d11-d757-4f8e-b347-60b3807c2dfe%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.