Help with ES 1.x percolator query plz


(JGL) #1

I am trying to use the new percolator that ships with ES 1.x. It seems
the new percolator only works with the match query, which, to my
knowledge, does not accept a list of query strings. Is there a way to
pass a list of values to a query that works with the new percolator?

What I am trying to do is to register a query in the percolator which
matches the ID of each inbound document against a list of given IDs [id1,
id2, id3].

A workaround I found is to put all the IDs into one query string, i.e.
"id1 id2 id3", to form a match query like the following, so that any
document with one of the 3 IDs makes the percolator report this query as
a match:

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "my_query_id",
  "_score" : 1.0,
  "_source" : {
    "query": {
      "match": {
        "id": {
          "query": "id1 id2 id3",
          "type": "boolean"
        }
      }
    }
  }
}

This seems a clumsy solution to me. Could anybody help me come up with a
more elegant one, for example using inFilter or inQuery, or any other way
to pass the ID list as a list into a query that works with the new
percolator?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/eb3ad7bf-a21e-4714-a53e-ea998a5ac9ef%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(JGL) #2

Can anybody help plz?



(Martijn Van Groningen) #3

Hi,

Can you share the stored percolator queries and the percolate request that
you initially tried, but that didn't work?

Martijn



(JGL) #4

Hi Martijn,

The percolator query in the first post above is what we registered with
the percolator. It more or less works, but it consolidates all IDs into one
query string for a match query, which does not seem an elegant solution to
us.

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "my_query_id",
  "_score" : 1.0,
  "_source" : {
    "query": {
      "match": {
        "id": {
          "query": "id1 id2 id3",
          "type": "boolean"
        }
      }
    }
  }
}

Another issue is that the above solution is not accurate when the IDs are
UUIDs. For example, if the query we register is the following

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "my_query_id",
  "_score" : 1.0,
  "_source" : {
    "query": {
      "match": {
        "id": {
          "query": "1aa808dc-48f0-4de3-8978-a0293d54b852 6b256fd1-cd04-4e3c-8f38-aaa87ac2220d 1234fd1a-cd04-4e3c-8f38-aaa87142380d",
          "type": "boolean"
        }
      }
    }
  }
}

then the percolator returns the above query as a match when the document we
percolate is {"doc" : {"id":"1aa808dc-48f0-4de3-8978-00293d54b852"}},
though we expect no match here, as the ID in the document does not equal
any ID in the query string.

Such false positives, according to our experiments, happen when the
document UUID is almost the same as one of the IDs in the query, differing
only in the last part of the ID. Is there an explanation for this behavior
of Elasticsearch?

Our other question is whether there is any way to put the UUID list, as a
list, into a query that works with the percolator, as we can with inQuery
or inFilter. We tried registering an inQuery and a query wrapping an
inFilter. Neither works with the percolator; it seems the percolator only
works with the MatchQuery, in which we cannot put the UUID list as a list.

For example, the following two queries we tried do not work with the
percolator:

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "inQuery",
  "_score" : 1.0,
  "_source" : {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
},

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "inFilterQ",
  "_score" : 1.0,
  "_source" : {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
},

Thanks for your help!

Jason



(JGL) #5

Can anybody help plz?



(JGL) #6

Can anybody help plz?



(JGL) #7

Can anybody help plz?



(Martijn Van Groningen) #8

I think the issue here is that the 'id' field is analyzed and your UUIDs
are broken up into separate tokens. The standard analyzer is responsible
for splitting on '-'. If you use the analyze API you can see what happens
to your UUIDs (spaces in the text are URL-encoded as %20):
curl -XGET 'localhost:9200/_analyze?text=1aa808dc-48f0-4de3-8978-a0293d54b852%206b256fd1-cd04-4e3c-8f38-aaa87ac2220d%201234fd1a-cd04-4e3c-8f38-aaa87142380d&tokenizer=standard'
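
To see why the near-miss UUID still matches, the splitting can be imitated
offline. The sketch below is only a rough stand-in for the standard
tokenizer: it lowercases and splits on any non-alphanumeric character, and
`simple_tokens` is a made-up helper, not an ES API. It also assumes the
match query's default OR behaviour, where one shared token is enough for a
match:

```python
import re


def simple_tokens(text):
    """Rough stand-in for the standard tokenizer (assumption, not ES code):
    lowercase the text and split on every non-alphanumeric character."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t]


# Query string and percolated document ID taken from the posts above.
query_ids = (
    "1aa808dc-48f0-4de3-8978-a0293d54b852 "
    "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d "
    "1234fd1a-cd04-4e3c-8f38-aaa87142380d"
)
doc_id = "1aa808dc-48f0-4de3-8978-00293d54b852"

query_tokens = set(simple_tokens(query_ids))
doc_tokens = set(simple_tokens(doc_id))

# Tokens present in both the query and the document.
shared = sorted(query_tokens & doc_tokens)

# With OR semantics a single shared token makes the query match.
matches_with_or = bool(shared)

print(shared)
print(matches_with_or)
```

The document ID shares its first four groups with the first registered
UUID and differs only in the last group, which would explain the false
positive described earlier in the thread.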

The 'id' field in your documents is not what ES uses as the document ID.
ES stores the unique identifier in the _id field, and that field is not
analyzed. Assuming the 'id' field has the same value as the document's ID,
you can use the ids query instead in your percolator queries:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html#query-dsl-ids-query

Martijn



(JGL) #9

Hi Martijn,

Thanks for the reply. The analyzer breaking up the UUIDs goes a long way
towards explaining why they are not matched as a whole.

I am still wondering whether we can register query types other than the
match query with the percolator. We would like to put a list of values into
a query on the "id" field, which holds a device ID, so that when we
percolate a document with a device ID, every percolator query whose ID list
contains that device ID is reported as a match.

But according to our experiments, queries like the following do not work
with the percolator, which seems happy only with match queries:

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "inQuery",
  "_score" : 1.0,
  "_source" : {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
},

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "inFilterQ",
  "_score" : 1.0,
  "_source" : {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
},

I could not find any resource that clearly states that the percolator only
works with match queries. Is that actually the case?

Thanks,
Jason



(Martijn Van Groningen) #10

The reason that these percolator queries don't match has nothing to do
with the percolator itself, but with text analysis. The default analyzer
for string fields also breaks up the ID at the dashes, and the terms filter
and query require exact matches, which results in no matches. The match
query, on the other hand, is smart enough to check whether a field is
analyzed and to use the analyzer configured in the mapping, so with the
match query the percolator query does match.
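
The difference between the two query types can be sketched without a
cluster. This is a simplified model, not ES code: `analyze` imitates the
default analyzer by lowercasing and splitting on non-alphanumeric
characters, and all function names are invented for illustration.

```python
import re


def analyze(value):
    """Simplified analyzer (assumption): lowercase, split on non-alphanumerics."""
    return {t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t}


def indexed_terms(value, analyzed):
    """Terms stored for a field value: tokens if analyzed, else the whole value."""
    return analyze(value) if analyzed else {value}


def terms_query_matches(stored, wanted):
    """Terms query/filter: the wanted values are looked up as-is, unanalyzed."""
    return any(w in stored for w in wanted)


def match_query_matches(stored, text):
    """Match query on an analyzed field: analyze the text first, OR semantics."""
    return bool(analyze(text) & stored)


doc_id = "1aa808dc-48f0-4de3-8978-a0293d54b852"
wanted = [doc_id, "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]

# Analyzed field: the whole UUID is not among the stored tokens.
terms_on_analyzed = terms_query_matches(indexed_terms(doc_id, True), wanted)
# not_analyzed field: the whole UUID is stored as one term.
terms_on_not_analyzed = terms_query_matches(indexed_terms(doc_id, False), wanted)
# Match query against the analyzed field: tokens overlap, so it matches.
match_on_analyzed = match_query_matches(indexed_terms(doc_id, True), " ".join(wanted))

print(terms_on_analyzed, terms_on_not_analyzed, match_on_analyzed)
```

In this model the terms lookup only succeeds once the field keeps the whole
UUID as a single term, which is what a not_analyzed mapping provides.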

In your case, if you want the terms filter/query to match the document
being percolated, you should do the following:

1) Create the index with the 'id' field mapped as not_analyzed:

curl -XPUT 'localhost:9200/my-index' -d '
{
  "mappings": {
    "my-type" : {
      "properties": {
        "id" : {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

2) Register the percolator query:

curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
  "query": {
    "terms": {
      "id": [
        "1aa808dc-48f0-4de3-8978-a0293d54b852",
        "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"
      ]
    }
  },
  "type" : "my-type"
}'

By specifying that the type is 'my-type', the percolator tells query
parsing not to analyze the id field values and to take the values as
is.

3) Percolate a document:

curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
  "doc": {
    "id": "1aa808dc-48f0-4de3-8978-a0293d54b852"
  }
}'
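
If many ID lists need to be registered, the percolator registration body
shown above can be generated rather than written by hand. The following is
a minimal sketch that only builds and prints the JSON; the helper name
`terms_percolator_body` is hypothetical, and sending the body to ES (for
example with the curl -XPUT call above) is left out:

```python
import json


def terms_percolator_body(field, values, doc_type):
    """Build a percolator registration body with a terms query on `field`.

    Mirrors the structure of the registration request in this post; the
    function itself is an illustrative helper, not part of any ES client.
    """
    return {
        "query": {"terms": {field: list(values)}},
        "type": doc_type,
    }


device_ids = [
    "1aa808dc-48f0-4de3-8978-a0293d54b852",
    "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d",
]

body = terms_percolator_body("id", device_ids, "my-type")
payload = json.dumps(body, indent=2)
print(payload)
```

The printed payload can be used as the -d argument when registering the
query under .percolator.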

On 16 May 2014 01:09, JGL j.g.liu.mu@gmail.com wrote:

Hi Martijin,

Thanks for the reply. The analyzer breaking up the UUID explains a lot why
the UUIDs are not matched as a whole.

I am still wondering if we can register other types of queries other than
match query into percolator. We would like to put a list of values into a
query for the "id" field, which is meant as a device ID, so that when we
percolate a document with a device ID, all percolator queries whose ID list
contains the device ID can be considered as a match.

But according to our experimentation, queries like the following are not
working with percolator, which seems only happy with match queries:

{
"_index" : "my_idx",
"_type" : ".percolator",
"_id" : "inQuery",
"_score" : 1.0, "_source" : {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
},

{
"_index" : "my_idx",
"_type" : ".percolator",
"_id" : "inFilterQ",
"_score" : 1.0, "_source" : {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
},

I could not find any resource clearly stating that the percolator only
works with match queries. Is that actually the case?

Thanks,
Jason

On Friday, May 9, 2014 10:04:51 PM UTC+12, Martijn v Groningen wrote:

I think the issue here is that the 'id' field is analyzed and your UUIDs
are broken up into separate tokens. The standard analyzer is responsible
for breaking them up on '-'. If you use the analyze api you can see what
happens with your UUIDs:
curl -XGET 'localhost:9200/_analyze?text=1aa808dc-48f0-4de3-8978-a0293d54b852%206b256fd1-cd04-4e3c-8f38-aaa87ac2220d%201234fd1a-cd04-4e3c-8f38-aaa87142380d&tokenizer=standard'
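
The same token breakdown also accounts for the false positive reported with
the near-identical UUID. A minimal Python sketch, assuming a simplified
tokenizer (split on anything that is not a letter or digit) in place of the
real standard tokenizer; tokens() is a hypothetical helper, and the two
strings are the registered query text and the percolated id from this thread:

```python
import re

def tokens(text):
    """Simplified stand-in for the standard tokenizer plus lowercasing (an assumption)."""
    return {t for t in re.split(r"[^0-9a-z]+", text.lower()) if t}

# Query string registered in the match query.
registered = (
    "1aa808dc-48f0-4de3-8978-a0293d54b852 "
    "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d "
    "1234fd1a-cd04-4e3c-8f38-aaa87142380d"
)
# Document id that was expected NOT to match (last group differs).
percolated = "1aa808dc-48f0-4de3-8978-00293d54b852"

shared = tokens(registered) & tokens(percolated)
print(sorted(shared))   # ['1aa808dc', '48f0', '4de3', '8978']
print(bool(shared))     # True: one shared token is enough for an OR-style match
```

With a boolean match query, whose clauses are OR-ed by default, any one of
those four shared pieces is enough to produce a hit, even though the full
UUID differs.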

A field named 'id' is not treated by ES as the document identifier. ES
stores the unique identifier in the _id field, and that field is not
analyzed. Assuming that your 'id' field has the same value as the _id of
the document, you can use the ids query instead in your percolator queries:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html#query-dsl-ids-query

Martijn

On 9 May 2014 09:20, JGL j.g.l...@gmail.com wrote:

Can anybody help plz?

On Wednesday, May 7, 2014 6:29:35 PM UTC+12, JGL wrote:

Can anybody help plz?

On Tuesday, May 6, 2014 11:53:32 AM UTC+12, JGL wrote:

Can anybody help plz?

On Monday, May 5, 2014 10:24:09 AM UTC+12, JGL wrote:

Hi Martijn,

The percolator query in the 1st post above is what we registered with the
percolator, and it kind of works. It consolidates all IDs into one query
string for a match query, which does not seem a very elegant solution to us.

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "my_query_id",
  "_score" : 1.0,
  "_source" : {
    "query":{
      "match":{
        "id":{
          "query":"id1 id2 id3",
          "type":"boolean"
        }
      }
    }
  }
}

Another issue is that the above solution is not quite accurate when
the IDs are UUIDs. For example, if the query we register is the following

{
  "_index" : "my_idx",
  "_type" : ".percolator",
  "_id" : "my_query_id",
  "_score" : 1.0,
  "_source" : {
    "query":{
      "match":{
        "id":{
          "query":"1aa808dc-48f0-4de3-8978-a0293d54b852 6b256fd1-cd04-4e3c-8f38-aaa87ac2220d 1234fd1a-cd04-4e3c-8f38-aaa87142380d",
          "type":"boolean"
        }
      }
    }
  }
}

then the percolator returns the above query as a match if the document we
try to percolate is {"doc" : {"id":"1aa808dc-48f0-4de3-8978-00293d54b852"}},
though we are expecting a no-match response here, as the ID in the document
does not match any ID in the query string.

Such a false positive, according to our experiments, happens when the doc
UUID is almost the same as one of the IDs in the query except for the last
part of the ID. Is there an explanation for this behavior of elasticsearch?

Our other question is whether there is any way we could put the UUID
list as a list into a query that works with the percolator, like what
we can do with inQuery or inFilter. We tried registering an inQuery and a
query wrapping an inFilter. Neither of them works with the percolator; it
seems the percolator only works with the MatchQuery, in which we cannot put
the UUID list as a list.

For example, the following two queries we tried do not work with the
percolator:

{
"_index" : "my_idx",
"_type" : ".percolator",
"_id" : "inQuery",
"_score" : 1.0, "_source" : {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
},

{
"_index" : "my_idx",
"_type" : ".percolator",
"_id" : "inFilterQ",
"_score" : 1.0, "_source" : {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
},

Thanks for your help!

Jason

On Friday, May 2, 2014 7:34:47 PM UTC+12, Martijn v Groningen wrote:

Hi,

Can you share the stored percolator queries and the percolate
request that you initially tried but that didn't work?

Martijn

On 2 May 2014 11:14, JGL j.g.l...@gmail.com wrote:

Can anybody help plz?


--
Met vriendelijke groet,

Martijn van Groningen



(JGL) #11

Hi Martijn,

It works!

Thank you so much for the help!

Thanks,
Jason



(system) #12