How do I create this query? Q = date > X and ("term" in body OR "term" in title)


(D) #1

Hi,

I'm new to Elasticsearch and a bit confused by the Query DSL. I have a
mapping for a type like this: {created: date, title: string, body: string}
How do I search for all documents created after a certain date that contain
a certain string in either the body or the title?

So something like Q = date > X and ("term" in body OR "term" in title)

{
'filter': {'range': {'created': {'gt': "2012-01-01T12:00:00"}}},
'...' # This is where I'm confused
}

--


(David Pilato) #2

Have a look at the bool filter or the bool query:
http://www.elasticsearch.org/guide/reference/query-dsl/bool-filter.html
http://www.elasticsearch.org/guide/reference/query-dsl/bool-query.html
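As a rough sketch of the bool query approach, the request body for date > X AND ("term" in body OR "term" in title) could be built like this in Python. The helper name build_bool_query is made up for illustration, and the term clauses assume the search string is a single token exactly as it was indexed (for example, lowercased by the analyzer); a text-matching query may fit better for free-form input.

```python
import json


def build_bool_query(term, after):
    """Build a search body for: created > after AND (term in body OR term in title).

    Hypothetical helper for illustration; the field names come from the
    mapping given in the question.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # date > X
                    {"range": {"created": {"gt": after}}},
                    # A nested bool with only "should" clauses:
                    # at least one of them has to match.
                    {
                        "bool": {
                            "should": [
                                {"term": {"body": term}},
                                {"term": {"title": term}},
                            ]
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the search request body.
    print(json.dumps(build_bool_query("term", "2012-01-01T12:00:00"), indent=2))
```

Both conditions sit under "must", so a document has to satisfy the date range and at least one of the two field clauses.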

HTH
David.

On 29 August 2012 at 17:02, D dougk7@gmail.com wrote:

Hi,

I'm new to Elasticsearch and a bit confused by the Query DSL. I have a mapping
for a type like this: {created: date, title: string, body: string}
How do I search for all documents created after a certain date that contain a
certain string in either the body or the title?

So something like Q = date > X and ("term" in body OR "term" in title)

{
'filter': {'range': {'created': {'gt': "2012-01-01T12:00:00"}}},
'...' # This is where I'm confused
}

--

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--


(D) #3

Thanks David,

That's exactly what I needed!

--


(D) #4

A filtered query does it as well; in fact, it's what I had in mind:

{"query": {
  "filtered": {
    "query": {"query_string": {"fields": ["title", "body"], "query": "term"}},
    "filter": {"range": {"created": {"gt": "2012-01-01T12:00:00"}}}
  }
}}

--


(phill) #5

I believe you could also have done the same thing with a query that was
PRE filtered, which is what a search request with a top-level query and a
top-level filter would give you:
{ "query": { ... }, "filter": { ... } }
I don't believe the order of query and filter matters in the above. The
hits are no different from those of a filtered query in your case (because
your two clauses are ANDed together regardless of order). The
only difference is in calculating facets: facets would NOT see
the results of a top-level filter, while a filtered query (filter
inside the query) will affect any facets.

http://www.elasticsearch.org/guide/reference/api/search/filter.html

Before adding a top-level filter, "[w]e get two hits, and the
relevant facets with a count of 1 for both green and blue [...] add
a filter [...] And now, we get only 1 hit back, but the facets remain
the same."

So if you're not using facets you were on the right track initially.
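To make the two request shapes concrete, here is a small Python sketch that builds both bodies from the query and filter already posted in this thread. The facet (its name title_terms and the choice of the title field) is a made-up example added only to show where the difference would appear; the function names are hypothetical too.

```python
import json

# Query and filter taken from the filtered query posted earlier in the thread.
TEXT_QUERY = {"query_string": {"fields": ["title", "body"], "query": "term"}}
DATE_FILTER = {"range": {"created": {"gt": "2012-01-01T12:00:00"}}}

# Hypothetical terms facet, included only to illustrate the difference.
FACETS = {"title_terms": {"terms": {"field": "title"}}}


def top_level_filter_body():
    """Filter next to the query: it narrows the hits, but facets do not see it."""
    return {"query": TEXT_QUERY, "filter": DATE_FILTER, "facets": FACETS}


def filtered_query_body():
    """Filter inside the query: it narrows the hits and the facets alike."""
    return {
        "query": {"filtered": {"query": TEXT_QUERY, "filter": DATE_FILTER}},
        "facets": FACETS,
    }


if __name__ == "__main__":
    print(json.dumps(top_level_filter_body(), indent=2))
    print(json.dumps(filtered_query_body(), indent=2))
```

Without the "facets" entry, both bodies should return the same hits, which is the case described above.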

-Paul

--


(David Pilato) #6

I'd add that the other main difference, I think, is execution speed.
Using filters reduces the number of documents to search.
Plus, no scoring is computed for filters.

In my opinion, it's better to use a match_all query with a filter than a query alone.

But I could be wrong.
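One possible reading of this suggestion, sketched in Python: wrap a match_all query in a filtered query and put every condition into the filter, so that nothing needs to be scored. The helper name build_match_all_filtered is hypothetical, the term filters assume the search string is a single token exactly as indexed, and this is not necessarily the exact request meant above.

```python
import json


def build_match_all_filtered(term, after):
    """Express created > after AND (term in body OR term in title) purely as filters.

    Hypothetical helper for illustration only.
    """
    return {
        "query": {
            "filtered": {
                # match_all selects everything; the filter does all the narrowing.
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": [
                            {"range": {"created": {"gt": after}}},
                            # Nested bool filter: at least one "should" clause has to match.
                            {
                                "bool": {
                                    "should": [
                                        {"term": {"body": term}},
                                        {"term": {"title": term}},
                                    ]
                                }
                            },
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_match_all_filtered("term", "2012-01-01T12:00:00"), indent=2))
```

The trade-off is that hits come back without meaningful relevance ordering, which only matters if you want the best matches for "term" ranked first.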

David.

--


(Shay Banon) #7

Just to note that, at least currently, using a top-level filter when you don't really need one (i.e. you have no facets, or the facets don't need it) will be slower. We can do better and automatically detect that, for example, a request doesn't use facets, and optimize the top-level filter into a filtered query, as that execution will be considerably faster.
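Until such an optimization exists on the server side, the same rewrite can be sketched on the client in Python. The function name inline_top_level_filter is hypothetical, the sketch only covers the simple case of a request without facets, and it says nothing about how Elasticsearch would implement this internally.

```python
import copy


def inline_top_level_filter(body):
    """Move a top-level filter into a filtered query when no facets are requested.

    Hypothetical client-side helper; returns the body unchanged when there is
    nothing to move or when facets are present.
    """
    if "filter" not in body or body.get("facets"):
        # With facets, moving the filter would change the facet counts.
        return body
    rewritten = copy.deepcopy(body)  # leave the caller's dict untouched
    top_filter = rewritten.pop("filter")
    # A request without a query is treated here as match_all.
    inner_query = rewritten.pop("query", {"match_all": {}})
    rewritten["query"] = {"filtered": {"query": inner_query, "filter": top_filter}}
    return rewritten


if __name__ == "__main__":
    request = {
        "query": {"query_string": {"fields": ["title", "body"], "query": "term"}},
        "filter": {"range": {"created": {"gt": "2012-01-01T12:00:00"}}},
    }
    print(inline_top_level_filter(request))
```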

--


(phill) #8

I think I misstated filtering when I said "PRE" filtered.

Am I right that both uses of the filter (in a filtered query and as a top-level filter) are applied after the query, so there is no "pre filtering" as I stated?

Since they both come after the query, why, then, is a filtered query faster? Is it just because in a filtered query there is no need to create a separate set of doc ids for (possible) later use in the facets, just before applying the filter? Or does it get messier than that? (No need for complete details, just some comments or warnings about where the additional overhead comes from.)

thanks,
-Paul

