Filter percolators before percolation query

Percolation changed a bit coming from 2.4 to 5.2. In 2.4 we had the opportunity to filter the percolators that were used before the percolation actually took place (see Additional supported query string options under Percolator API in 2.4).

In the 5.2 Percolator Query documentation I can't seem to find a way of filtering the percolators first.

I think it's probably done with some clever combination of other filters, but I can't work out how to do it. Any tips?

Hi Frank,
given that percolate is now a query, you can just use the query as part of a bool query which also has a filter clause. That one will filter the documents being percolated.

Also, we now index terms extracted from most of the queries, so we perform pre-filtering internally even if you don't provide any.

Cheers
Luca

Thanks for your quick reply, but what I was looking for was a way to filter the percolators that are applied, not the documents being percolated. The docs section on Additional supported query string options under the Percolator API in 2.4 shows that nicely.

This was existing functionality in ES 2.4 and I can't imagine that it was thrown out because it's very useful to limit the amount of resources needed to percolate. Anyone else?

To clarify, an example query would be:

{
  "doc": {
    "created_at": "2010-10-10T00:00:00",
    "message": "some text"
  },
  "filter": {
    "term": {
      "percolator_group_name": "category"
    }
  }
}

That way we could percolate only categories (and not all the other percolator groups, such as tags). The percolator group name would be just a non-analyzed string field (in 2.4) in the .percolator index. I can still add the extra field in 5.x (as a keyword type), but I can't find the filter option shown in the query above to restrict percolation to specific percolators.
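
For reference, a minimal sketch of how such a metadata field could be mapped in 5.x next to the percolator field. The type names `doc` and `queries` and the field name `query` are assumptions that follow the convention used in the 5.x reference docs; they are not taken from the actual index discussed here. The body would be sent when creating the index:

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "created_at": { "type": "date" },
        "message": { "type": "text" }
      }
    },
    "queries": {
      "properties": {
        "query": { "type": "percolator" },
        "percolator_group_name": { "type": "keyword" }
      }
    }
  }
}
```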

I just upgraded to 5.2 and knew that I had to migrate to the percolate query, but just like you, I couldn't find a way to filter the percolators by metadata (this is practically a showstopper for us).

Did you find a solution?

Thanks,
joe

I don't know percolation very well but I expect something like this will work:

Modify the doc:

{
    "created_at": "2010-10-10T00:00:00",
    "message": "some text",
    "percolator_group_name": "some name"
}

Modify the percolator:

{
   "bool": {
      "must": [
         {"term": {"percolator_group_name": "some name"}},
         {your original percolator}
      ]
   }
}

With the term extraction this ought to filter the attempted percolators. You don't get to use the same document, but it ought to work.

The example that @nik9000 has shared is the way to prefilter percolator queries based on their metadata before percolation. In 5.x, this example results in the same behavior with the search API as @frankkoornstra's example in 2.x with the percolate API.

Thanks @nik9000 and @mvg, that would technically work.

My use case: the end users define the percolator queries, and on save I attach metadata to them, like company (this will also be on the document), concern/corporate (not on the doc), region (not on the doc), the user creating the percolator (not on the doc), etc. This way the user can maintain the query and the application the metadata - very straightforward and elegant.

The suggested 5.x way requires me to alter both the user's query (while showing only the user's part in the UI editor) and the document. That is doable but feels forced.

Any chance of re-implementing meta data on percolator? :slight_smile:

@Joejoe The use case you describe doesn't require you to alter the user's query. In fact adding metadata to a percolator query remained the same between 2.x and 5.x.
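
As a rough illustration, a stored percolator document with metadata next to an untouched user query might look like the sketch below. The field name `query` (the field mapped as `percolator`) and the metadata fields `company`, `region` and `created_by` with their values are hypothetical, and the sketch assumes the metadata fields are mapped as `keyword`:

```json
{
  "query": {
    "match": { "message": "some text" }
  },
  "company": "acme",
  "region": "emea",
  "created_by": "joe"
}
```

Only the `query` part would need to be shown in the UI editor; the remaining fields are maintained by the application.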

The main difference here is that at percolate time you use the percolate query in the search API. The percolate query doesn't contain the user's query, but the document you'd like to percolate. If you'd also like to filter percolator queries on previously attached metadata, then you change the search request body to have a bool query with two or more must clauses: one clause for the percolate query, and one additional must clause for each metadata property you'd like to filter on. This is similar to how you would define filters in the 2.x percolate API.
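
A minimal sketch of such a search request body, assuming a 5.x index where the percolator field is named `query`, the percolated documents have type `doc`, and the stored queries carry hypothetical `company` and `region` keyword fields (none of these names come from an actual setup in this thread):

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "percolate": {
            "field": "query",
            "document_type": "doc",
            "document": {
              "created_at": "2010-10-10T00:00:00",
              "message": "some text"
            }
          }
        },
        { "term": { "company": "acme" } },
        { "term": { "region": "emea" } }
      ]
    }
  }
}
```

Since the term clauses do not need to contribute to scoring, they could probably be placed in the bool query's `filter` clause instead of `must`.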

Can you share rest / curl examples of steps you think you need to do? Then I get a better idea what you mean. I think we are somehow misunderstanding each other.

@mvg thanks for following up. I'm travelling now, but I'll get back with a concrete example on Monday.

@nik9000 @mvg thanks! Now that I see it, it seems so straightforward :smiley:
