Percolation changed a bit between 2.4 and 5.2. In 2.4 we could filter the percolators that were used before the percolation actually took place (see Additional supported query string options under Percolator API in 2.4).
In the 5.2 Percolator Query documentation I can't seem to find a way of filtering the percolators first.
I think it's probably done with some clever combination of other filters, but I can't work out how to do it. Any tips?
Hi Frank,
Given that percolate is now a query, you can just use it as part of a bool query which also has a filter clause. That one will filter the documents being percolated.
Also, we now index terms extracted from most of the queries, so we perform pre-filtering internally even if you don't provide any.
Thanks for your quick reply, but what I was looking for was a way to filter the percolators that are applied, not the documents being percolated. The docs section on Additional supported query string options under the Percolator API in 2.4 shows that nicely.
This was existing functionality in ES 2.4 and I can't imagine that it was thrown out, because it's very useful for limiting the amount of resources needed to percolate. Anyone else?
That way we could percolate only categories (and not all other percolator groups, for example tags). The percolator group name would be just a non-analyzed string field (in 2.4) in the .percolator index. I can still put the extra field in 5.x (as keyword type), but I can't seem to find the filter option as shown in the query above to filter for only specific percolators.
Just upgraded to 5.2 and I knew that I had to migrate to the percolate query, but just like you, I couldn't find a way to filter the percolators by meta data (this is practically a showstopper for us).
The example that @nik9000 has shared is the way to prefilter percolator queries based on their metadata before percolation. This example in 5.x results in the same behavior with the search API as @frankkoornstra's example in 2.x with the percolate API.
Thanks @nik9000 and @mvg , that would technically work.
My use case: the end users define the percolator queries and on save I attach meta data to them, like company (this will also be on the document), consern/corporate (not on the doc), region (not on the doc), the user creating the percolator (not on the doc) etc. This way the user can maintain the query and the application the meta data - very straightforward and elegant.
The suggested 5.x way requires me to alter both the user's query (while showing only the user's part to the user in the UI editor) and the document - doable but feels forced.
Any chance of re-implementing meta data on percolator?
@Joejoe The use case you describe doesn't require you to alter the user's query. In fact adding metadata to a percolator query remained the same between 2.x and 5.x.
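As a rough sketch of what storing such a percolator could look like in 5.x: the user's query goes untouched into the percolator field and the metadata sits next to it as ordinary fields. The type names (doc, queries), the field names (query, group, company) and the helper percolator_doc are made up for illustration and are not taken from this thread.

```python
import json

# Hypothetical 5.x index mapping: a "percolator" field for the stored query
# plus "keyword" fields for the metadata. All names here are assumptions.
MAPPING = {
    "mappings": {
        "doc": {"properties": {"title": {"type": "text"}}},
        "queries": {
            "properties": {
                "query": {"type": "percolator"},
                "group": {"type": "keyword"},
                "company": {"type": "keyword"},
            }
        },
    }
}


def percolator_doc(user_query, **metadata):
    """Wrap the user's query unchanged and attach metadata as sibling fields."""
    if "query" in metadata:
        raise ValueError("'query' is reserved for the percolator field")
    doc = {"query": user_query}
    doc.update(metadata)
    return doc


# The user only ever edits user_query; the application adds the metadata.
user_query = {"match": {"title": "elasticsearch"}}
stored = percolator_doc(user_query, group="categories", company="acme")
print(json.dumps(stored, indent=2))
```

The printed JSON would be the body you index into the (assumed) queries type; the user's part stays exactly as entered under the query key.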
The main difference here is that at percolate time you use the percolate query in the search API. The percolate query doesn't contain the user's query, but the document you'd like to percolate. If in addition you'd like to filter percolator queries on previously attached metadata, then you change the search request body to a bool query with two or more must clauses: one clause for the percolate query and one further must clause for each metadata property you'd like to filter on. This is similar to how you would define filters in the 2.x percolate API.
Can you share REST / curl examples of the steps you think you need to do? Then I'll get a better idea of what you mean. I think we are somehow misunderstanding each other.
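A minimal sketch of such a search request body, built as a Python dict so it can be inspected before sending. The helper name build_percolate_search, the type name doc and the metadata field group are assumptions for illustration. The metadata terms are placed in the bool query's filter section here, since they don't need scoring; putting them in must clauses as described above should select the same percolators.

```python
import json


def build_percolate_search(document, metadata, field="query", document_type="doc"):
    """Build a 5.x search body: percolate `document`, but only against
    stored queries whose metadata fields match `metadata` exactly."""
    # One term clause per metadata property (sorted for a stable order).
    metadata_clauses = [
        {"term": {name: value}} for name, value in sorted(metadata.items())
    ]
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "percolate": {
                            "field": field,                  # the percolator field
                            "document_type": document_type,  # type of the percolated doc
                            "document": document,            # the doc itself, not the user's query
                        }
                    }
                ],
                "filter": metadata_clauses,
            }
        }
    }


body = build_percolate_search(
    {"title": "upgrading elasticsearch"},
    {"group": "categories"},
)
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the regular _search endpoint of the index holding the percolator queries; only the document and the metadata values change per request.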