DocValues overhead in ES

Hi,
I analyzed a heap dump taken from Elasticsearch and I can see that a lot of
heap space is occupied by structures and references related to doc values. I
can see tons of hash maps with weak references pointing to objects
representing some values in DV. I was wondering whether this is cached on
the ES side or whether it is a purely Lucene-internal mechanism. Can we
influence the size/number of instances of objects connected to field data?

I'd be glad if someone can explain it to me.

--
Paweł Róg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHngsdhxHCDXXcx_NzcDNd%2BPnMJ%3DQigcscNXfn8kcjSCh4x3mg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

If you have a lot of unique values and you ask for aggregations looking for
unique values amongst those, then what you are seeing can happen.
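
For illustration, the kind of request described above might look like the sketch below. It is an assumption rather than anything taken from the original poster's setup: the field name `user_id` and the aggregation name `unique_values` are made up, and the body is only built and printed, not sent to a cluster.

```python
import json


def terms_aggregation_body(field, size=10):
    """Build a search body asking for the top `size` distinct values of `field`."""
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            "unique_values": {  # hypothetical aggregation name
                "terms": {"field": field, "size": size}
            }
        },
    }


# "user_id" is a made-up, presumably high-cardinality field
body = terms_aggregation_body("user_id")
print(json.dumps(body, indent=2))
```

On a field with many distinct values, a request of this shape has to track a bucket per value it encounters, which is the situation the reply refers to.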

(In reply to Paweł Róg's message of 26 March 2015 at 03:05.)

Hi,
Thank you for your response. Unfortunately I think we misunderstood each
other. I was NOT asking whether the described case can happen, because I can
see that it does :) I was rather asking about ES internals and whether there
is any way to optimize such a case (including source code modifications).

--
Paweł Róg

(In reply to Mark Walkom's message of Thursday, March 26, 2015 at 3:31:51 AM UTC+1.)
