Percolating returns more results than I expect

I've been smashing my head against the wall with this, and I'm wondering if anyone can point out anything obvious that I'm missing.

Using ES 5.6 (I know it's out of date; I'm in the process of upgrading).

I have an advert in ES:
GET /gb/classified/155597007

{
    "_index": "gb-1",
    "_type": "classified",
    "_id": "155597007",
    "_version": 1,
    "found": true,
    "_source": {
        ...
        "product": {
            "age": "19"
        },
        "categories": [
            {
                "meta_code": "missed-connections"
            }
        ]
    }
}

Then I have some email alerts:
Alert 1

  • MUST categories.meta_code = missed-connections
  • MUST product.age = 19
  • MUST_NOT exist = foreign_ids.kiwii_mysql.partner_id

GET /gb/email_alerts/15531
{
    "_index": "gb-1",
    "_type": "email_alerts",
    "_id": "15531",
    "_version": 1,
    "found": true,
    "_source": {
        .....
        "type": "classified",
        "query": {
            "bool": {
                "must": {
                    "match_all": {}
                },
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "categories.meta_code": "missed-connections"
                                }
                            },
                            {
                                "range": {
                                    "product.age": {
                                        "gte": 19,
                                        "lte": 19
                                    }
                                }
                            }
                        ],
                        "must_not": [
                            {
                                "exists": {
                                    "field": "foreign_ids.kiwii_mysql.partner_id"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}

Alert 2

  • MUST categories.meta_code = missed-connections
  • MUST_NOT exist = foreign_ids.kiwii_mysql.partner_id

GET /gb/email_alerts/15534
{
    "_index": "gb-1",
    "_type": "email_alerts",
    "_id": "15534",
    "_version": 1,
    "found": true,
    "_source": {
        .....
        "type": "classified",
        "query": {
            "bool": {
                "must": {
                    "match_all": {}
                },
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "categories.meta_code": "missed-connections"
                                }
                            }
                        ],
                        "must_not": [
                            {
                                "exists": {
                                    "field": "foreign_ids.kiwii_mysql.partner_id"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}

The email_alerts documents store percolator queries. When I search them with a classified document that has both

  • categories.meta_code = missed-connections
  • product.age = 19

GET /gb/email_alerts/_search
query:

{
    "explain": true,
    "query": {
        "percolate": {
            "field": "query",
            "document_type": "classified",
            "document": {
                "categories": [
                    { "meta_code": "missed-connections" }
                ],
                "product": {
                    "age": 19
                }
            }
        }
    }
}

I am expecting to see just one result: the alert that requires both of these. Instead, 2 are returned. What am I missing?

{
    "took": 10,
    "timed_out": false,
    "_shards": {
        "total": 4,
        "successful": 4,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 1.0,
        "hits": [
            {
                "_shard": "[gb-1][0]",
                "_node": "RTS98o-US_K0v0StGxHpUQ",
                "_index": "gb-1",
                "_type": "email_alerts",
                "_id": "15531",
                "_score": 1.0,
                "_source": {
                    ....
                    "type": "classified",
                    "query": {
                        "bool": {
                            "must": {
                                "match_all": {}
                            },
                            "filter": {
                                "bool": {
                                    "must": [
                                        {
                                            "term": {
                                                "categories.meta_code": "missed-connections"
                                            }
                                        },
                                        {
                                            "range": {
                                                "product.age": {
                                                    "gte": 19,
                                                    "lte": 19
                                                }
                                            }
                                        }
                                    ],
                                    "must_not": [
                                        {
                                            "exists": {
                                                "field": "foreign_ids.kiwii_mysql.partner_id"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    },
                    "frequency": {
                        "push": true
                    },
                    "category_meta_code": "missed-connections",
                    "location_id": "0",
                    "search_geo_radius": false,
                    "kiwii_criteria": "{\"geo_radial_distance\":\"0\",\"sp_personals_age\":{\"start\":\"19\",\"end\":\"19\"},\"category_meta\":\"missed-connections\",\"searchGeoId\":\"0\"}"
                },
                "_explanation": {
                    "value": 1.0,
                    "description": "sum of:",
                    "details": [
                        {
                            "value": 1.0,
                            "description": "PercolateQuery",
                            "details": [
                                {
                                    "value": 1.0,
                                    "description": "ConstantScore(+categories.meta_code:missed-connections +product.age:[19 TO 19] -ConstantScore(_field_names:foreign_ids.kiwii_mysql.partner_id)), product of:",
                                    "details": [
                                        {
                                            "value": 1.0,
                                            "description": "boost",
                                            "details": []
                                        },
                                        {
                                            "value": 1.0,
                                            "description": "queryNorm",
                                            "details": []
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "value": 0.0,
                            "description": "match on required clause, product of:",
                            "details": [
                                {
                                    "value": 0.0,
                                    "description": "# clause",
                                    "details": []
                                },
                                {
                                    "value": 1.0,
                                    "description": "_type:email_alerts, product of:",
                                    "details": [
                                        {
                                            "value": 1.0,
                                            "description": "boost",
                                            "details": []
                                        },
                                        {
                                            "value": 1.0,
                                            "description": "queryNorm",
                                            "details": []
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            },
            {
                "_shard": "[gb-1][0]",
                "_node": "RTS98o-US_K0v0StGxHpUQ",
                "_index": "gb-1",
                "_type": "email_alerts",
                "_id": "15534",
                "_score": 1.0,
                "_source": {
                    ....
                    "type": "classified",
                    "query": {
                        "bool": {
                            "must": {
                                "match_all": {}
                            },
                            "filter": {
                                "bool": {
                                    "must": [
                                        {
                                            "term": {
                                                "categories.meta_code": "missed-connections"
                                            }
                                        }
                                    ],
                                    "must_not": [
                                        {
                                            "exists": {
                                                "field": "foreign_ids.kiwii_mysql.partner_id"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    },
                    "frequency": {
                        "push": true
                    },
                    "category_meta_code": "missed-connections",
                    "location_id": "0",
                    "search_geo_radius": false,
                    "kiwii_criteria": "{\"geo_radial_distance\":\"0\",\"category_meta\":\"missed-connections\",\"searchGeoId\":\"0\"}"
                },
                "_explanation": {
                    "value": 1.0,
                    "description": "sum of:",
                    "details": [
                        {
                            "value": 1.0,
                            "description": "PercolateQuery",
                            "details": [
                                {
                                    "value": 1.0,
                                    "description": "ConstantScore(+categories.meta_code:missed-connections -ConstantScore(_field_names:foreign_ids.kiwii_mysql.partner_id)), product of:",
                                    "details": [
                                        {
                                            "value": 1.0,
                                            "description": "boost",
                                            "details": []
                                        },
                                        {
                                            "value": 1.0,
                                            "description": "queryNorm",
                                            "details": []
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "value": 0.0,
                            "description": "match on required clause, product of:",
                            "details": [
                                {
                                    "value": 0.0,
                                    "description": "# clause",
                                    "details": []
                                },
                                {
                                    "value": 1.0,
                                    "description": "_type:email_alerts, product of:",
                                    "details": [
                                        {
                                            "value": 1.0,
                                            "description": "boost",
                                            "details": []
                                        },
                                        {
                                            "value": 1.0,
                                            "description": "queryNorm",
                                            "details": []
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            }
        ]
    }
}

I have not used percolate, nor your version, in a long time, but I will have a go anyway.

The document you are sending in seems to match both percolate queries. The only difference between them is that one requires product.age to be 19 (which it is), while the other does not look at that field at all (so it naturally matches too). Which one would you expect not to match, and why?
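
To make that concrete, here is a toy model of percolation in plain Python. It is not Elasticsearch code: get_path, alert_15531, alert_15534 and percolate are names made up for this sketch, and the two predicates are hand-translated from the stored alert queries above. It only illustrates that every stored query is tested against the document on its own.

```python
# Toy model of percolation. NOT Elasticsearch code; all names are illustrative.

def get_path(doc, path):
    """Return all values found at a dotted path, looking inside lists of objects."""
    values = [doc]
    for key in path.split("."):
        next_values = []
        for value in values:
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict) and key in item:
                    next_values.append(item[key])
        values = next_values
    # Flatten a trailing list of scalars (a multi-valued field).
    flat = []
    for value in values:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


def alert_15531(doc):
    # term on categories.meta_code, range 19..19 on product.age, must_not exists
    return (
        "missed-connections" in get_path(doc, "categories.meta_code")
        and any(19 <= age <= 19 for age in get_path(doc, "product.age"))
        and not get_path(doc, "foreign_ids.kiwii_mysql.partner_id")
    )


def alert_15534(doc):
    # Same as above, but with no condition on product.age at all.
    return (
        "missed-connections" in get_path(doc, "categories.meta_code")
        and not get_path(doc, "foreign_ids.kiwii_mysql.partner_id")
    )


STORED_ALERTS = {"15531": alert_15531, "15534": alert_15534}


def percolate(doc):
    """Return the id of every stored alert whose query matches the document."""
    return [alert_id for alert_id, matches in STORED_ALERTS.items() if matches(doc)]


# The document from the percolate request (age as a number, as in the request).
document = {
    "categories": [{"meta_code": "missed-connections"}],
    "product": {"age": 19},
}

print(percolate(document))  # ['15531', '15534']
```

Extra fields in the document never stop a stored query from matching; only the conditions inside each stored query decide the outcome.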

Thanks for the reply. Then I assume it is my misunderstanding of what should be returned and how this works. Since I am including both in the query

"categories": [
    { "meta_code": "missed-connections" }
],
"product": {
    "age": 19
}

I was expecting / hoping that this would only return the results which match both these:

  • categories.meta_code = missed-connections
    AND
  • product.age = 19

I am guessing that's not correct? And if so, do you know a way to do this?

You have 2 queries stored, /gb/email_alerts/15531 and /gb/email_alerts/15534, and both of these match.

The other query also matches as it has exactly the same conditions apart from product.age, which is left out. All queries that match the document will be returned.
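
If only the alerts that actually constrain product.age are wanted, one possible approach (an assumption on my part, not a built-in percolator feature) is to post-filter the returned hits on the client by checking which fields each stored query mentions. The sketch below is plain Python over the hits.hits list of the response; referenced_fields, alerts_constraining and FIELD_KEYED are hypothetical helper names, and the list of leaf query types is deliberately short and not exhaustive.

```python
# Client-side post-filter sketch. Helper names are hypothetical; this only
# inspects the stored query JSON and calls no Elasticsearch API.

# Leaf query types whose inner key is a field name, e.g. {"term": {"a.b": ...}}.
# Assumption: this short list covers the alerts shown in this thread only.
FIELD_KEYED = {"term", "range", "match"}


def referenced_fields(query):
    """Recursively collect field names mentioned anywhere in a query DSL dict."""
    fields = set()
    if isinstance(query, list):
        for item in query:
            fields |= referenced_fields(item)
    elif isinstance(query, dict):
        for key, value in query.items():
            if key in FIELD_KEYED and isinstance(value, dict):
                fields |= set(value.keys())
            elif key == "exists" and isinstance(value, dict):
                fields.add(value["field"])
            else:
                fields |= referenced_fields(value)
    return fields


def alerts_constraining(hits, required_fields):
    """Keep the ids of hits whose stored query mentions every required field."""
    required = set(required_fields)
    return [
        hit["_id"]
        for hit in hits
        if required <= referenced_fields(hit["_source"]["query"])
    ]


def stored_query(must_clauses):
    """Build a query shaped like the two alerts in this thread."""
    return {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": must_clauses,
                    "must_not": [
                        {"exists": {"field": "foreign_ids.kiwii_mysql.partner_id"}}
                    ],
                }
            },
        }
    }


category = {"term": {"categories.meta_code": "missed-connections"}}
age = {"range": {"product.age": {"gte": 19, "lte": 19}}}

# Trimmed-down stand-in for the "hits.hits" array of the percolate response.
hits = [
    {"_id": "15531", "_source": {"query": stored_query([category, age])}},
    {"_id": "15534", "_source": {"query": stored_query([category])}},
]

print(alerts_constraining(hits, ["categories.meta_code", "product.age"]))  # ['15531']
```

Note that this only checks whether a field is mentioned somewhere in the stored query, including under must_not, so it is a rough filter rather than a statement about what the query requires.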

:man_facepalming:

It took me re-reading this whole thing a few times to realise I'm thinking about the percolated document as a filter, when it is the query on each alert that does the filtering.

Thanks for putting it in plain words and replying.
