Query missing nested objects along with other filter criteria

I have been working on migrating from elastic search 2.2 to 5.x and re-writing the queries.
There is one place where I have hit a road block and would really appreciate help.
Here is the scenario:
We have an index which contains people information like name, gender etc. It contains nested objects like addresses, as 1 person can have multiple address or none at all.
The query I am trying to create is to get all the people who have an address.state along with those who do not have an address at all.
I am able to get separately everyone who does not have the nested address by using “must_not” and exists:
{
"query": {
"bool":{
"must_not" :{"exists":{"field":"address.state"}}
}
}
}
And get people who have an address.State = WA or CA by a nested filter query
{
"query": {
"bool": {
"filter":[
{"nested":{
"path":" address ",
"query":{
"bool":
{
"filter":[
{"terms":{" address.state ":["WA","CA"]}}

                                                                                                                                            ] 
                                                                                                            }
                                                                                            }
                                                                            }
                                            } 
                                            ] 
}

}
}

But not able to join both into 1 query, to get all who do not have the nested address or have an address in WA/CA

{
"query": {

                            "bool":{
                                            "filter":[
                                                            {"nested":{"path":"test-degree",
                                                                            "query":{
                                                                                            "bool":
                                                                                                            {"filter":[
                                                                                                                            {"terms":{" address.state ":["WA","CA"]}}, 
                                                                                                                            
                                                                                                                            ]
                                                                                                            ,  "must_not" :{"exists":{"field":" address.state"}} 
                                                                                                            }
                                                                                                            }
                                                                                            }
                                                            }
                                                                            ]
                                            }

            }

}

There are several things going on here:

  • That first exists query is not a nested query. I'm sure it needs to be, if you want to query for address.state
  • Your bool query uses a filter. You can think of a filter like an "AND": all clauses inside the filter clause must match for a document to be returned as a hit. I think what you're trying to do is more like an "OR". In that case, use should instead of filter.
  • The queries in your bool query are not the same as your first two queries.
  • I think the nested path test-degree needs to be address

So, let's simplify. If your first query is:

{
  "query": {
    "nested": {
      "path": "address",
      "query": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "address.state"
            }
          }
        }
      }
    }
  }
}

And the second (no need to wrap it in a filter):

{
  "query": {
    "nested": {
      "path": "address",
      "query": {
        "bool": {
          "filter": [
            {
              "terms": {
                "address.state": [
                  "WA",
                  "CA"
                ]
              }
            }
          ]
        }
      }
    }
  }
}

Then you can combine them in a bool like this:

{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "address",
            "query": {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "address.state"
                  }
                }
              }
            }
          }
        },
        {
          "nested": {
            "path": "address",
            "query": {
              "bool": {
                "filter": [
                  {
                    "terms": {
                      "address.state": [
                        "WA",
                        "CA"
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

Thanks for clarifying some of the issues.
However if I change the first query to be a nested query with path. It returns zero results.
May be because the nested object itself is missing on the parent document.

If i execute your final query, it still gives me only documents where address.state in WA or CA and does not returns the records where address is not present.

Aha, I misunderstood and did not realize that your nested objects could be completely missing (I thought you wanted to check for a missing address.state). Add a clause to your query that checks for a missing address explicitly:

{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "address",
                  "query": {
                    "exists": {
                      "field": "address"
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "nested": {
            "path": "address",
            "query": {
              "bool": {
                "must_not": [
                  {
                    "exists": {
                      "field": "address.state"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "address",
            "query": {
              "bool": {
                "filter": [
                  {
                    "terms": {
                      "address.state": [
                        "WA",
                        "CA"
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
2 Likes

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.