Effect of Term filter length on performance, and Prefix filter vs. Term filter


(arthurx) #1

How does the Prefix filter compare to the Term filter performance-wise?

For example, when matching "o1382334", a prefix filter of "o" only needs to
check the first letter, whereas a term filter needs to match all eight
characters of "o1382334". Assuming they match an equal number of documents,
the prefix filter seems more "lightweight".

Could someone who understands the underlying algorithm of the term filter
comment on this?

A related question: does using a very long string as the term in a term
filter, such as an MD5 hash instead of a simple integer ID, affect
performance?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cc2f2637-42fd-49b3-9ef9-2560f87a9cc2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Adrien Grand) #2

Thanks to the inverted index, terms are only looked up once per segment, not
once per document. So I don't think the number of characters to compare
would have any noticeable performance impact.

One benefit of the term filter, though, might be that the terms dictionary
index can tell that a term is not contained in the terms dictionary without
going to disk.
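
To make the "once per segment" point concrete, here is a minimal Python sketch of a toy segment. It is only an illustration under simplifying assumptions, not how Lucene or Elasticsearch actually store terms: the `Segment` class, its `term_filter` and `prefix_filter` methods, and the sample IDs are all hypothetical, and the terms dictionary is modeled as a plain sorted list. The term filter does a single dictionary lookup, while the prefix filter seeks once and then has to walk every term that shares the prefix.

```python
from bisect import bisect_left


class Segment:
    """Toy stand-in for one index segment (hypothetical, not Lucene's layout)."""

    def __init__(self, docs):
        # docs maps doc id -> term; one term per document keeps the sketch small.
        postings = {}
        for doc_id, term in docs.items():
            postings.setdefault(term, []).append(doc_id)
        # Sorted terms dictionary plus a postings list per term.
        self.terms = sorted(postings)
        self.postings = {t: sorted(ids) for t, ids in postings.items()}

    def term_filter(self, term):
        # One binary search in the terms dictionary, however many docs match.
        i = bisect_left(self.terms, term)
        if i < len(self.terms) and self.terms[i] == term:
            return list(self.postings[term])
        return []  # term is absent from this segment

    def prefix_filter(self, prefix):
        # Seek to the first term >= prefix, then visit every term sharing it.
        i = bisect_left(self.terms, prefix)
        matched = []
        while i < len(self.terms) and self.terms[i].startswith(prefix):
            matched.extend(self.postings[self.terms[i]])
            i += 1
        return sorted(matched)


segment = Segment({1: "o1382334", 2: "o1382335", 3: "p0000001", 4: "o1382334"})
print(segment.term_filter("o1382334"))  # [1, 4]
print(segment.prefix_filter("o"))       # [1, 2, 4]
print(segment.term_filter("zzz"))       # []
```

In this model the cost of a term lookup depends on the size of the terms dictionary, and the length of the term only affects a handful of string comparisons. The cost of the prefix filter grows with the number of distinct terms that start with the prefix.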

On Tue, Apr 29, 2014 at 6:30 AM, arthurX fc28222@gmail.com wrote:

How does the Prefix filter compare to the Term filter performance-wise?

For example, when matching "o1382334", a prefix filter of "o" only needs to
check the first letter, whereas a term filter needs to match all eight
characters of "o1382334". Assuming they match an equal number of documents,
the prefix filter seems more "lightweight".

Could someone who understands the underlying algorithm of the term filter
comment on this?

A related question: does using a very long string as the term in a term
filter, such as an MD5 hash instead of a simple integer ID, affect
performance?


--
Adrien Grand


