Elasticsearch Lucene queries versus Elasticsearch query string

Hi,

I tried running Elasticsearch queries in the Lucene-query format and also
in the query-string format.
I found that the Lucene queries were considerably faster than the
query-string queries, approximately 4 times faster. Can someone please shed
light on why that is?

Thanks,
John

--

What do you mean by "lucene-query" format? The Elasticsearch Query String
Query accepts Lucene query syntax, i.e. "this AND that OR whatever". Were
you talking about using that format vs. crafting the underlying query
objects

BOOLEAN

--

Didn't get to finish that:

    BOOLEAN
        MUST this
        MUST that
        SHOULD whatever

?
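
The two forms Matt contrasts can be sketched as request bodies. This is a minimal illustration in Python that only builds dictionaries and sends nothing to a cluster; the helper names and the field name `body` are hypothetical, and the clause layout is only roughly what a query string such as "this AND that OR whatever" would be parsed into.

```python
import json

def query_string_body(text):
    """Request body that hands Lucene syntax to the query_string parser."""
    return {"query": {"query_string": {"query": text}}}

def bool_body(must_terms, should_terms, field="body"):
    """Request body that spells out the boolean clauses explicitly.

    `field` is a hypothetical field name used only for illustration.
    """
    return {
        "query": {
            "bool": {
                "must": [{"term": {field: t}} for t in must_terms],
                "should": [{"term": {field: t}} for t in should_terms],
            }
        }
    }

as_string = query_string_body("this AND that OR whatever")
as_objects = bool_body(["this", "that"], ["whatever"])
print(json.dumps(as_objects, indent=2))
```

The first form leaves parsing to Elasticsearch; the second states each MUST and SHOULD clause directly.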

--

By Lucene query format I mean using the Java API for Elasticsearch. The
API allows you to build a query as a combination of queries and filters, the
way you mentioned below.
Before that I was using query strings.

On Tuesday, January 22, 2013 10:17:54 AM UTC-6, Matt Weber wrote:

Didn't get to finish that:

    BOOLEAN
        MUST this
        MUST that
        SHOULD whatever

?

--

On Tue, 2013-01-22 at 10:13 -0800, john wrote:

By Lucene query format I mean using the Java API for Elasticsearch.
The API allows you to build a query as a combination of queries and filters,
the way you mentioned below.
Before that I was using query strings.

It'd be a lot easier to figure out what you were comparing if you
provided the code.

clint

--

Clint, sorry for replying so late. I had gone on vacation.

This is what I mean by Elasticsearch query string:

{
  "fields": ["X.price", "X.name", "X.marketValue", "X.salePrice"],
  "from": 0,
  "size": 10,
  "sort": [{"X.price.alphanum": "asc"}],
  "filter": {
    "query": {
      "query_string": {
        "default_operator": "AND",
        "query": "X.Y:false AND X.marketplace:false AND X.active:true AND X.Z:false AND X.startDate:[1910-01-01 TO 2013-01-29] AND X.end:true AND (((X.name:floor* OR X.manufacturer:floor* OR X.desc:floor* OR X.desc:floor* OR X.futures.feature:floor* OR X.details.value:floor*)))"
      }
    }
  }
}

And this is what I mean by Lucene queries. Queries created in the
manner below are a lot faster than if I were to create a query string for
the same conditions.
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "filtered": {
          "query": {
            "match_all": {}
          },
          "filter": {
            "term": {
              "X.id": "*"
            }
          }
        }
      }
    }
  },
  "fields": [
    "X.id"
  ],
  "sort": [
    {
      "X.id": {
        "order": "asc"
      }
    }
  ]
}

John

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ignoring the differences in the query itself (e.g. they query different
things), I believe the performance difference is more related to your
filter choice. In the top query, you are using a "Query Filter", which
just wraps a query with a filter so you can cache the returned results.
This can be useful if you hit the same query over and over, where it makes
sense to just save the output instead of processing the request. However,
you don't gain any filter search-time performance. You are effectively
running a really big query string against all the docs.

The lower query uses a "Filtered Query", which first takes your corpus of
documents and discards those that don't match the filter. Whatever is
left over is then searched with your query. This is very fast because the
filters are cached in memory, so you quickly reduce the number of documents
that need to be evaluated against your query.

Perhaps I'm misunderstanding your question. Curious to see what Clint says.

-Zach
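
The two request shapes Zach contrasts can be put side by side. This is a minimal sketch in Python that only builds the request bodies, using the DSL as it appears in this thread; the helper names and the shortened example conditions are hypothetical stand-ins for John's full query.

```python
import json

def query_filter_request(query_text):
    """Shape of the slower request: a query_string wrapped in a top-level filter."""
    return {
        "filter": {
            "query": {
                "query_string": {"default_operator": "AND", "query": query_text}
            }
        }
    }

def filtered_query_request(filters, query):
    """Shape of the faster request: filters narrow the documents, then the query runs."""
    return {"query": {"filtered": {"filter": {"and": filters}, "query": query}}}

# Shortened, hypothetical conditions standing in for the full query in the thread.
slow = query_filter_request("X.active:true AND X.name:floor*")
fast = filtered_query_request(
    filters=[{"term": {"X.active": True}}],
    query={"prefix": {"X.name": "floor"}},
)
print(json.dumps(fast, indent=2))
```

In the first shape every condition lives inside one parsed query string; in the second, the exact-match conditions are separate filters and only the text match remains a query.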

--

You're right, the lower queries are a lot faster, approximately 4 times
faster than the top query. I just wanted to know what makes them different
and why the lower ones are faster.
The top queries have to be created manually and the lower ones using the
Java API for Elasticsearch.

--

QueryString is a Query. It is analyzed for each field involved, and a score is computed.
TermFilter is a Filter. It is not analyzed, and there is no scoring phase.

+1 for what Zachary wrote about caching.

That is what mainly explains the speed difference between the two.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
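
The difference David and Zachary describe can be illustrated with a toy in-memory model. This is not how Elasticsearch is implemented; the documents, the cache layout, and the scoring function are all made up for illustration. It only shows the idea that a filter is a cacheable yes/no set of documents, while scoring runs afterwards on the documents that survive.

```python
# Hypothetical toy corpus: doc id -> fields.
docs = {
    1: {"active": True, "name": "floor lamp"},
    2: {"active": False, "name": "floor mat"},
    3: {"active": True, "name": "desk"},
}

filter_cache = {}  # (field, value) -> frozenset of matching doc ids

def term_filter(field, value):
    """Yes/no match with no scoring; the result set is cached for reuse."""
    key = (field, value)
    if key not in filter_cache:
        filter_cache[key] = frozenset(
            doc_id for doc_id, doc in docs.items() if doc.get(field) == value
        )
    return filter_cache[key]

def score(doc, word):
    """Stand-in for relevance scoring: how often the word occurs in the name."""
    return doc["name"].split().count(word)

def filtered_search(field, value, word):
    """Filter first, then score only the documents that passed the filter."""
    candidates = term_filter(field, value)
    scored = [(doc_id, score(docs[doc_id], word)) for doc_id in sorted(candidates)]
    return [(doc_id, s) for doc_id, s in scored if s > 0]

print(filtered_search("active", True, "floor"))  # prints [(1, 1)]
```

A second search with the same filter reuses the cached set, so only the scoring step is repeated.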

--

On Wed, 2013-01-30 at 14:27 -0800, john wrote:

You're right, the lower queries are a lot faster, approximately 4 times
faster than the top query. I just wanted to know what makes them
different and why the lower ones are faster.
The top queries have to be created manually and the lower ones using
the Java API for Elasticsearch.

I've rewritten your query_string query to use filters, each of which
will be cached. See how this one performs.

(Note: wildcards are not an optimal way to query large amounts of data.
It is much better to use edge ngrams.)

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
  "query" : {
    "filtered" : {
      "filter" : {
        "and" : [
          {
            "term" : {
              "X.Y" : 0
            }
          },
          {
            "term" : {
              "X.Z" : 0
            }
          },
          {
            "term" : {
              "X.active" : 1
            }
          },
          {
            "term" : {
              "X.end" : 1
            }
          },
          {
            "term" : {
              "X.marketplace" : 0
            }
          },
          {
            "range" : {
              "X.startDate" : {
                "lte" : "2013-01-29",
                "gte" : "1910-01-01"
              }
            }
          }
        ]
      },
      "query" : {
        "multi_match" : {
          "fields" : [
            "X.name",
            "X.manufacturer",
            "X.desc",
            "X.futures.feature",
            "X.details.value"
          ],
          "query" : "floor",
          "type" : "phrase_prefix"
        }
      }
    }
  }
}
'
clint
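
Clint's note about edge ngrams can be illustrated with a small sketch. This is plain Python rather than an Elasticsearch analyzer; the parameter names `min_gram` and `max_gram` merely echo the usual edge n-gram settings, and the sample documents are hypothetical. The idea is that prefixes are generated once at index time, so a search for "floor" becomes an exact term lookup instead of a wildcard scan.

```python
from collections import defaultdict

def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the prefixes of token whose length is between min_gram and max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

def build_index(docs):
    """Map every edge n-gram of every lowercased token to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for gram in edge_ngrams(token):
                index[gram].add(doc_id)
    return index

# Hypothetical sample documents.
docs = {1: "Flooring tiles", 2: "Floor lamp", 3: "Desk lamp"}
index = build_index(docs)

# The prefix search is now a single dictionary lookup.
print(sorted(index["floor"]))  # prints [1, 2]
```

The trade-off is a larger index in exchange for cheaper prefix searches.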
