Multiple bool must clauses on nested fields

Hi everyone,

I have a working query that I would like to discuss to see if there are ways to make it both smaller and more efficient.

I have a mapping that has a similar structure to this one:

{
    "documents": {
        "properties": {
            "content": {
                "type": "string"
            },
            "id": {
                "type": "string",
                "index": "not_analyzed"
            },
            "user": {
                "type": "object",
                "properties": {
                    "id": {
                        "index": "not_analyzed",
                        "type": "string"
                    },
                    "fields": {
                        "include_in_parent": true,
                        "type": "nested",
                        "properties": {
                            "key": {
                                "type": "string",
                                "index": "not_analyzed"
                            },
                            "value": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}

As you can see, user is an object that itself contains a nested object: fields.
fields has include_in_parent set, but I only added it here in case some solution can leverage it; my current query relies on nested.

So what I am trying to get is a document whose user has both key: "name", value: "john" and key: "segment", value: "active".

My first approach was to use a nested bool:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "id": "foo"
              }
            },
            {
              "nested": {
                "path": "user.fields",
                "query": {
                  "filtered": {
                    "filter": {
                      "bool": {
                        "must": [
                          {
                            "bool": {
                              "must": [
                                {
                                  "term": {
                                    "user.fields.key": "name"
                                  }
                                },
                                { 
                                  "term": {
                                    "user.fields.value": "john"
                                  }
                                }
                              ]
                            }
                          },
                          {
                            "bool": {
                              "must": [
                                {
                                  "term": {
                                    "user.fields.key": "segment"
                                  }
                                },
                                {
                                  "term": {
                                    "user.fields.value": "active"
                                  }
                                }
                              ]
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}

But it didn't work.
My hope was that it performed something like:

(user.fields.key:"name" AND user.fields.value:"john") AND (user.fields.key:"segment" AND user.fields.value:"active")

but since it does not work I assume it is actually matching something like:

user.fields.key:"name" AND user.fields.value:"john" AND user.fields.key:"segment" AND user.fields.value:"active"

which cannot match any document, since each object in fields holds only one key/value pair (hope this makes sense).
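
To make that assumption concrete, here is a minimal Python sketch of how I understand nested matching: each object under user.fields is tested on its own, and the parent document matches when at least one object satisfies the whole inner filter. This is a toy model and not Elasticsearch code; the helper names (term, all_of, nested_matches) and the sample fields list are made up for illustration.

```python
# Toy model of nested matching, not Elasticsearch itself.
# Assumption: a nested clause matches a parent document when at least ONE
# nested object satisfies the entire inner filter on its own.

def term(field, expected):
    """Return a predicate that checks one field of a single nested object."""
    return lambda obj: obj.get(field) == expected

def all_of(*predicates):
    """Model of bool/must: every predicate must hold for the same object."""
    return lambda obj: all(p(obj) for p in predicates)

def nested_matches(nested_objects, inner_filter):
    """Model of one nested clause over a list of nested objects."""
    return any(inner_filter(obj) for obj in nested_objects)

# Hypothetical user.fields content for a single document.
fields = [
    {"key": "name", "value": "john"},
    {"key": "segment", "value": "active"},
]

name_is_john = all_of(term("key", "name"), term("value", "john"))
segment_is_active = all_of(term("key", "segment"), term("value", "active"))

# First attempt: both pairs inside ONE nested clause.
single_clause = nested_matches(fields, all_of(name_is_john, segment_is_active))

# Working query: one nested clause per pair, combined at the parent level.
one_clause_per_pair = (
    nested_matches(fields, name_is_john)
    and nested_matches(fields, segment_is_active)
)

print(single_clause, one_clause_per_pair)  # prints: False True
```

Under this model no single object can have key "name" and key "segment" at the same time, so the single-clause form never matches, while one clause per pair does.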

So currently we have this working query:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "id": "foo"
              }
            },
            {
              "nested": {
                "path": "user.fields",
                "query": {
                  "filtered": {
                    "filter": {
                      "bool": {
                        "must": [
                          {
                            "term": {
                              "user.fields.key": "name"
                            }
                          },
                          {
                            "term": {
                               "user.fields.value": "john"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              }
            },
            {
              "nested": {
                "path": "user.fields",
                "query": {
                  "filtered": {
                    "filter": {
                      "bool": {
                        "must": [
                          {
                            "term": {
                              "user.fields.key": "segment"
                            }
                          },
                          {
                            "term": {
                              "user.fields.value": "active"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}

But we are concerned about the size of the query because:

  1. Our users will be able to query as many user.fields as they want, so the query can grow without bound

  2. There's an overhead on the REST side (not a huge problem, but still a point)

  3. This is the first query we have written and it really doesn't feel right: it looks like too much boilerplate and is non-intuitive in a way.
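
On the boilerplate point, one option is to generate the repeated nested clauses instead of writing them by hand. Below is a minimal Python sketch that rebuilds the working query above from a list of key/value pairs. The function names (nested_pair_filter, build_query) are hypothetical, and this only reduces the code we maintain; the request body sent over REST stays the same size.

```python
import json

def nested_pair_filter(key, value, path="user.fields"):
    """Build one nested clause that matches a single key/value object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "filtered": {
                    "filter": {
                        "bool": {
                            "must": [
                                {"term": {path + ".key": key}},
                                {"term": {path + ".value": value}},
                            ]
                        }
                    }
                }
            },
        }
    }

def build_query(doc_id, pairs):
    """Combine the id term with one nested clause per requested pair."""
    must = [{"term": {"id": doc_id}}]
    must.extend(nested_pair_filter(k, v) for k, v in pairs)
    return {"query": {"filtered": {"filter": {"bool": {"must": must}}}}}

# Same conditions as the working query shown earlier.
query = build_query("foo", [("name", "john"), ("segment", "active")])
print(json.dumps(query, indent=2))
```

Each extra pair a user asks for adds exactly one nested clause to the outer must list, so the growth is linear and the shape of every clause stays identical.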

Is there another approach to this problem that we are totally missing here?

(Sorry for the long post)

Thanks
