Slow search performance when using mmap versus memory


(MikeP-2) #1

Our servers have 130 GB of RAM and we are giving ElasticSearch 30GB of
heap. Each machine contains one shard that is about 3GB with about 1.3
million documents. When we search using a match query and the "memory"
store type, the queries are about twice as fast as when we use "mmap" (with
mlockall enabled). Are there any tools to troubleshoot the latency?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/afde39cc-e0c4-49b8-b3ec-a107bf8a4755%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

Can you share your setup configuration, along with an example document and
a query, so that it is possible to recreate your situation?

The OS version, ES version, and JVM version would also be of interest.

Thanks,

Jörg



(Jörg Prante) #3

N.B. The response times you measured, and how you are measuring them, are
of course also of interest.

In my tests with ES 1.2.1 and JDK 8u5 on Mac OS X, with a test index of 1.5
million docs, I see search times of around 1-5ms on mmapfs for random match
queries (except the first query after the node started). Index store
"memory" is not faster.
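
Since how the timings are taken matters here, the following is a minimal sketch of a client-side timing harness that discards warm-up queries (such as the first query after a node start) before summarizing. The names `time_queries`, `run_query` and `fake_search` are hypothetical and not part of any Elasticsearch client; `run_query` stands in for whatever function actually sends the search.

```python
import statistics
import time


def time_queries(run_query, queries, warmup=1):
    """Time each query and summarize, skipping the first `warmup` runs.

    `run_query` is any callable that executes one search (an assumption:
    in practice it would call your Elasticsearch or Lucene client).
    """
    timings_ms = []
    for i, query in enumerate(queries):
        start = time.perf_counter()
        run_query(query)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= warmup:  # ignore cold-cache runs
            timings_ms.append(elapsed_ms)
    if not timings_ms:
        raise ValueError("need more queries than warm-up runs")
    return {
        "count": len(timings_ms),
        "min_ms": min(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


def fake_search(query):
    # Stand-in for a real search call, used only to make the sketch runnable.
    return {"hits": []}


stats = time_queries(fake_search, ["1 2 3", "4 5 6", "7 8 9"], warmup=1)
print(stats)
```

Running the same harness against both store types, with the same query list, keeps the comparison like for like.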

Jörg



(MikeP-2) #4

Hi Jörg,

I'm using ES 1.1.1, JDK7u55, RHEL5. Our documents contain only integers
between 0 and 1,000,000. Our query is a bit unusual, as it is basically a
match query that contains only ORs, and there could be up to 1,000 clauses.
For example:

{
  "query": {
    "match": {
      "num_field": {
        "query": "1 2 3 4 5 99998 99999",
        "operator": "or"
      }
    }
  }
}

When we search using fewer than 100 clauses, the search is pretty quick,
typically less than 30ms on both "memory" and "mmap". When there are 1,000
clauses, the average is about 200ms using "memory" and about 400ms using
"mmap".
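
To reproduce the 1,000-clause case, a query body of the same shape can be generated programmatically. This is only a sketch: `build_match_query` is a hypothetical helper, and the random values and the seed are assumptions made for illustration, not the actual data.

```python
import json
import random


def build_match_query(field, values, operator="or"):
    """Build a match query body whose terms are the given integers."""
    return {
        "query": {
            "match": {
                field: {
                    "query": " ".join(str(v) for v in values),
                    "operator": operator,
                }
            }
        }
    }


# Assumed test data: 1,000 distinct integers in the range 0..1,000,000.
rng = random.Random(42)
values = rng.sample(range(0, 1000001), 1000)
body = build_match_query("num_field", values)

# Print only the start of the serialized body to keep the output short.
print(json.dumps(body)[:80])
```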

Thanks for the help.



(Jörg Prante) #5

You should use a boolean query and wrap it in a constant score query. The
constant score query is important; otherwise each clause triggers a score
calculation, which has a significant impact on the overall search response
time.
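
As a rough sketch of that suggestion, the request body could be built as below. It assumes the ES 1.x form of `constant_score` that wraps a `query` element, with one `term` clause per value under `bool`/`should`; the helper name `build_constant_score_bool` is hypothetical. Note that every matching document then gets the same score.

```python
import json


def build_constant_score_bool(field, values):
    """Wrap a bool query of OR-ed term clauses in a constant_score query."""
    should_clauses = [{"term": {field: v}} for v in values]
    return {
        "query": {
            "constant_score": {
                # Assumption: constant_score accepts a wrapped "query" here.
                "query": {
                    "bool": {
                        "should": should_clauses
                    }
                }
            }
        }
    }


body = build_constant_score_bool("num_field", [1, 2, 3, 4, 5, 99998, 99999])
print(json.dumps(body, indent=2))
```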

There is also a notable difference in performance on AWS between "memory"
and "mmapfs" because AWS VMs are slow, but that is another story, not
related to queries.

Jörg



(MikeP-2) #6

We actually need the score for each. When we run the same query directly on
Lucene (no ElasticSearch), the response times are fast, typically under
100ms. We're trying to understand why there is such a large difference in
response times when searching with ElasticSearch versus Lucene.


