PerThreadIDAndVersionLookup - thread safety

Hi,
Can anyone briefly describe why the class PerThreadIDAndVersionLookup is not
thread safe and what would be needed to make it thread safe? I'm wondering if
it is possible to keep only a single instance of VersionLookup that is not
bound to a thread. I see a big chunk of JVM memory wasted only because of the
class PerThreadIDAndVersionLookup.

Thanks a lot for any suggestions or advice.

--
Paweł

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHngsdi_u_gj0PAaahB%2B8fEhsqRQ0SNr5LrFw5_oPJcs4LqyYA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

It is not thread safe because of the TermsEnum array, which cannot be
shared between threads. Because the array is not shared, each thread can
reuse its own copy, which avoids expensive reinitialization.
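
A minimal sketch of the per-thread reuse pattern described above, not the actual Elasticsearch code. `Cursor` is a hypothetical stand-in for a stateful iterator such as a TermsEnum, and `PerThreadCursors` with its int-array "segments" is made up for illustration. The immutable data is shared, while each thread lazily builds its own cursor array once and then reuses it:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a stateful, non-thread-safe iterator (e.g. a TermsEnum).
final class Cursor {
    private int position = -1; // mutable state: unsafe to share across threads

    // Positions the cursor on the target term; returns true if it was found.
    boolean seekExact(int[] sortedTerms, int target) {
        position = Arrays.binarySearch(sortedTerms, target);
        return position >= 0;
    }
}

// Hypothetical lookup: shared read-only segments, one cursor array per thread.
final class PerThreadCursors {
    private final AtomicInteger created = new AtomicInteger();
    private final int[][] segments;            // shared, never modified
    private final ThreadLocal<Cursor[]> cursors;

    PerThreadCursors(int[][] segments) {
        this.segments = segments;
        this.cursors = ThreadLocal.withInitial(() -> {
            // Runs once per thread, on that thread's first lookup.
            created.incrementAndGet();
            Cursor[] perSegment = new Cursor[segments.length];
            for (int i = 0; i < perSegment.length; i++) {
                perSegment[i] = new Cursor();
            }
            return perSegment;
        });
    }

    // Returns the index of the first segment containing id, or -1 if none does.
    int findSegment(int id) {
        Cursor[] mine = cursors.get(); // reused on every call from this thread
        for (int i = 0; i < segments.length; i++) {
            if (mine[i].seekExact(segments[i], id)) {
                return i;
            }
        }
        return -1;
    }

    // Number of per-thread cursor arrays built so far.
    int cursorSetsCreated() {
        return created.get();
    }
}
```

In this sketch the memory cost grows with the number of threads that perform lookups, since each of them holds one cursor per segment.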

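The utility class was introduced in

Indexing: Versions.loadDocIdAndVersion should reuse TermsEnums · Issue #6212 · elastic/elasticsearch · GitHub

and from what I understand it replaced the previous ID/version lookup based
on bloom filters (which come with a very noticeable RAM cost).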

Maybe you have lots of segments?

Sometimes ThreadLocals misbehave because of Java issues, and they are hard
to clean up. So I think it would be helpful if you could post some more
detailed information about what you have seen and which OS, JVM, and ES
versions you use.
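
For context on why cleanup matters: a value stored in a ThreadLocal stays reachable from its thread until `remove()` is called or the thread exits, so long-lived pool threads can pin large caches. A small sketch of explicit cleanup, using a hypothetical `ScratchBuffers` helper that is not part of Elasticsearch or Lucene:

```java
// Hypothetical per-thread scratch cache, for illustration only.
final class ScratchBuffers {
    private static final ThreadLocal<long[]> BUFFER = new ThreadLocal<>();

    // Returns this thread's buffer, allocating only if none is cached or it is too small.
    static long[] acquire(int size) {
        long[] buf = BUFFER.get();
        if (buf == null || buf.length < size) {
            buf = new long[size];
            BUFFER.set(buf);
        }
        return buf;
    }

    // True if the calling thread currently holds a cached buffer.
    static boolean isCached() {
        return BUFFER.get() != null;
    }

    // Drops the calling thread's buffer so it can be garbage collected.
    static void release() {
        BUFFER.remove();
    }
}
```

Calling `release()` in a `finally` block before a thread goes back to its pool trades the reinitialization cost for a smaller heap footprint.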

Jörg

Hi,
If only the TermsEnum array is not thread safe, maybe it's worth considering
binding only that array (or any other arrays that require it) to a thread?
A heap dump report showed me that a lot of space is taken by the versions
NumericDocValues array.

--
Paweł

Sorry for being unclear. The TermsEnum array is one of the arrays used for
iteration (the most important one), but the other arrays are not thread safe
either: you can view all the private class variables as a thread-private cache.
NumericDocValues is the key component for retrieving the version.

Jörg
