Understanding Percolator query extraction

Hi, all!

I'm fighting with the performance of the percolate query.

The first thing I need to understand is under what conditions terms can be extracted from a query at indexing time.

Specifically, will terms be extracted from the filter part of a bool query?

For example, given this mapping:

curl "localhost:9200/queries/_mapping?pretty"
{
  "queries": {
    "mappings": {
      "_doc": {
        "dynamic": "false",
        "properties": {
          "location": {
            "type": "geo_point"
          },
          "query": {
            "type": "percolator"
          },
          "status": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

And this query:

curl -X PUT "localhost:9200/queries/_doc/1" -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "filter": [{
          "term": {
            "status": "active"
          }
        },
        {
          "geo_distance": {
            "distance": "15km",
            "location": {
              "lat": 55.8899,
              "lon": 37.5926
            }
          }
        }
      ],
      "must": [],
      "must_not": []
    }
  }
}
'

Will {"term": {"status": "active"}} be extracted, or do I need to index it manually, like this:

curl -X PUT "localhost:9200/queries/_doc/1" -H 'Content-Type: application/json' -d '{
  "status": "active",
  "query": {
    "bool": {
      "filter": [
        {
          "geo_distance": {
            "distance": "15km",
            "location": {
              "lat": 55.8899,
              "lon": 37.5926
            }
          }
        }
      ],
      "must": [],
      "must_not": []
    }
  }
}
'

and modify my percolate query from this:

{
  "query": {
    "bool": {
      "filter": [
        {
          "percolate": {
            "field": "query",
            "document": {
              "title": "Some document title",
              "status": "active"
            }
          }
        }
      ]
    }
  }
}

to this:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "status": "active"
          }
        },
        {
          "percolate": {
            "field": "query",
            "document": {
              "title": "Some document title",
              "status": "active"
            }
          }
        }
      ]
    }
  }
}

Or maybe I need to move the filter part of the bool query to its must part?

Massive thank you in advance!

After some debugging in the IDE, I found that terms are extracted from both the must and the filter parts of the bool query. The extracted information is stored in fields of the indexed query document.
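To make that concrete, here is a rough Python sketch of what I observed. It is only a toy model, not Elasticsearch's actual extraction code: the function name `extract_terms` and the `verified` flag are my own illustrative names, and it handles nothing except `term` clauses inside `must` and `filter`.

```python
def extract_terms(query):
    """Toy model of index-time term extraction (not Elasticsearch's real code)."""
    terms = set()
    verified = True  # True only if the terms alone fully describe the query
    bool_query = query.get("bool", {})
    if bool_query.get("must_not") or bool_query.get("should"):
        verified = False  # assumption: this sketch does not model these sections
    for section in ("must", "filter"):
        clauses = bool_query.get(section, [])
        if isinstance(clauses, dict):  # a single clause may be given without a list
            clauses = [clauses]
        for clause in clauses:
            if "term" in clause:
                for field, value in clause["term"].items():
                    if isinstance(value, dict):  # long form: {"field": {"value": ...}}
                        value = value["value"]
                    terms.add((field, value))
            else:
                verified = False  # e.g. geo_distance: nothing to extract here
    return terms, verified


# The stored query from the question, as a Python dict.
stored_query = {
    "bool": {
        "filter": [
            {"term": {"status": "active"}},
            {"geo_distance": {"distance": "15km",
                              "location": {"lat": 55.8899, "lon": 37.5926}}},
        ],
        "must": [],
        "must_not": [],
    }
}

terms, verified = extract_terms(stored_query)
print(terms, verified)
```

In this model the `status` term is picked up whether the clause sits in `filter` or in `must`, and the `geo_distance` clause only means the query still has to be executed for real before it counts as a match.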

Not sure how this information is used in the percolate query.
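From what I can tell from the percolator documentation, at search time the percolated document is indexed into a temporary in-memory index, its terms are used to preselect candidate queries, and only those candidates are then executed against the document. Below is a minimal Python sketch of that preselection step. Everything in it is hypothetical and simplified: the `extracted` table, the query ids, and the `candidate_ids` function are mine, all extracted terms are treated as required, and document fields are treated as exact keyword values.

```python
# Hypothetical table: stored query id -> terms extracted at index time.
extracted = {
    "1": {("status", "active")},
    "2": {("status", "archived")},
    "3": set(),  # nothing could be extracted, so it must always be checked
}


def candidate_ids(extracted, document):
    """Return ids of stored queries whose extracted terms all occur in the document."""
    doc_terms = {(field, value) for field, value in document.items()}
    # Subset test: every extracted term must be present in the document.
    return sorted(qid for qid, terms in extracted.items() if terms <= doc_terms)


document = {"title": "Some document title", "status": "active"}
candidates = candidate_ids(extracted, document)
print(candidates)
```

Only the candidates would then be run against the in-memory document, which is where a clause like `geo_distance` actually gets evaluated. If this picture is right, the extracted `status` term already narrows the candidate set without a separate `status` field on the query document.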
