Trying to filter percolators, what am I doing wrong?


(Brett Anderson) #1

I interpreted the percolator filter feature shown here http://www.elasticsearch.org/guide/reference/api/percolate.html to
mean that I could give percolators additional terms and then supply a query
on those terms when percolating a document. This would first retrieve the
subset of percolators that match the query, then percolate the document
against that set. In my example https://gist.github.com/2632887 I create
two percolators with a term called 'color', set to 'blue' for both. Each
percolator has a query on a field called 'sport', one for the word
'sailing', the other for the word 'tennis'. If I percolate a document with
the sport field set to 'tennis', ES correctly returns a match on the tennis
percolator alone. If however I percolate the same doc with an additional
query to filter the percolators, a match for both percolators is returned.
I've found this to occur in both 0.19.2 and 0.19.3. Is my understanding of
this feature correct? If so what is wrong with my application of it, or is
this a bug?

As mentioned above, the example is at https://gist.github.com/2632887.
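
For readers who cannot open the gist, the scenario described above can be sketched roughly as follows. This is an illustrative reconstruction, not the gist itself: the index name `test`, the type name `doc`, and the percolator names `sailing` and `tennis` are assumptions, and the endpoints follow the 0.19 percolate API page linked above. The `step` helper is hypothetical; it only prints each curl command unless `RUN=1` is set, so the script can be read or dry-run without a node.

```shell
#!/bin/sh
# Illustrative reconstruction of the scenario; NOT the actual gist.
# Assumed names: index "test", type "doc", percolators "sailing"/"tennis".
ES="${ES:-localhost:9200}"

# Hypothetical helper: print the curl command, and only send it when RUN=1
# (sending requires a live 0.19.x node).
step() {
  printf '%s\n' "curl $*"
  if [ "${RUN:-0}" = "1" ]; then curl -s "$@"; echo; fi
}

# Create the index the percolators are registered against.
step -XPUT "$ES/test"

# Two percolators, both tagged color=blue, each querying the sport field.
step -XPUT "$ES/_percolator/test/sailing" -d '{"color":"blue","query":{"term":{"sport":"sailing"}}}'
step -XPUT "$ES/_percolator/test/tennis" -d '{"color":"blue","query":{"term":{"sport":"tennis"}}}'

# Plain percolation: per the description, only "tennis" should match.
step -XGET "$ES/test/doc/_percolate" -d '{"doc":{"sport":"tennis"}}'

# Filtered percolation: first restrict to percolators with color=blue.
# Intended result: still only "tennis"; 0.19.2/0.19.3 returned both.
step -XGET "$ES/test/doc/_percolate" -d '{"doc":{"sport":"tennis"},"query":{"term":{"color":"blue"}}}'
```

Run with `RUN=1 sh script.sh` against a scratch node to send the requests for real.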

Thanks,
LJ.


(Shay Banon) #2

It's a bug. I opened an issue:
https://github.com/elasticsearch/elasticsearch/issues/1925. It will be fixed
shortly on both the 0.19 branch and master.

On Tue, May 8, 2012 at 9:10 AM, Laser Jesus brett.anderson.ftw@gmail.com wrote:

Is my understanding of this feature correct? If so what is wrong with my
application of it, or is this a bug?


(Brett Anderson) #3

Thanks for the quick solution. Just retried with the latest for 0.19.4 and
it's working nicely.


(Brett Anderson) #4

I'm having some more issues with this bug under a more constrained
environment. If you run my gist here https://gist.github.com/2632887, except
for the last line, which deletes the index, and make sure all the lines
before it are executed back to back with no pause, on a fresh ES 0.19.4
instance with no data or logs folders, you get a strange result. The last
percolation request, which uses percolator filtering, does not match. If you
re-run it a second later it matches correctly. If you delete the index and
then run the script again, again with no pause, the last percolation request
does match. I stumbled on this one when some of my unit tests worked when I
had a break-point, but failed without it...


(Shay Banon) #5

What happens if you add a POST to refresh the _percolator index after you
register the queries? Does that solve it?

On Wed, May 16, 2012 at 7:34 AM, Laser Jesus brett.anderson.ftw@gmail.com wrote:

The last percolation request, which uses percolator filtering, does not
match. If you re-run it a second later it matches correctly.


(Brett Anderson) #6

Do you mean this:

curl -XPOST localhost:9200/_percolator/_flush

Because yes, that fixed it :)


(Shay Banon) #7

No, actually, flush is not really relevant; refresh is the one we care
about.

On Thu, May 17, 2012 at 3:04 AM, Laser Jesus brett.anderson.ftw@gmail.com wrote:

Do you mean this:

curl -XPOST localhost:9200/_percolator/_flush


(Brett Anderson) #8

OK, I just changed it to refresh and all unit tests are still working. Thanks.
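
For anyone hitting the same timing problem in tests, the ordering that worked can be sketched like this. It is a sketch under assumptions, not the exact commands from the gist: the index `test`, type `doc`, and percolator name `tennis` are made up for illustration, and the `step` helper is hypothetical (it prints each command and only sends it when `RUN=1`). The likely reason the refresh matters is that the filter query is run as a search over the `_percolator` index, so a percolator registered a moment ago may not be visible to that search until the index is refreshed.

```shell
#!/bin/sh
# Register, refresh, then percolate with a filter.
# Assumed names: index "test", type "doc", percolator "tennis".
ES="${ES:-localhost:9200}"

# Hypothetical helper: print the curl command, and only send it when RUN=1
# (sending requires a live node).
step() {
  printf '%s\n' "curl $*"
  if [ "${RUN:-0}" = "1" ]; then curl -s "$@"; echo; fi
}

# 1. Register the percolator with its extra "color" term.
step -XPUT "$ES/_percolator/test/tennis" -d '{"color":"blue","query":{"term":{"sport":"tennis"}}}'

# 2. Refresh (not flush) the _percolator index so the new entry is searchable.
step -XPOST "$ES/_percolator/_refresh"

# 3. Percolate with the filter query on "color".
step -XGET "$ES/test/doc/_percolate" -d '{"doc":{"sport":"tennis"},"query":{"term":{"color":"blue"}}}'
```

In a unit test, step 2 goes right after the setup that registers the percolators and before the first filtered percolate call.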

