Filter results by the length of the nested objects returned, similar to min_child

I'm trying to filter my results based on the number of nested objects found for a document.
I've tried using scores but I cant get scores to work with filters, and I've tried a script but can't get that working either.

Below are more details on all of it. Basically I need to:

  • Filter the parent documents by their properties
  • Filter each nested object by a few properties
  • Finally, filter the parent document by the number of nested documents remaining to a single fixed number

These are my mappings:

    "mappings": {
      "properties": {
        "dates" : {
          "type" : "nested",
          "properties" : {
            "rooms" : {
              "type" : "integer"
            },
            "timestamp" : {
              "type" : "long"
            }
          }
        },
        "doc_id" : {
          "type" : "text"
        },
        "distance" : {
          "type" : "integer"
        }
        ...
      }
    }

Example data

    PUT /test/_doc/1
    {
      "doc_id": "1",
      "distance": 1,
      "dates": [
        {
          "rooms": 1,
          "timestamp": 1
        },
        {
          "rooms": 1,
          "timestamp": 2
        },
        ...
      ]
    }

There are unknown number of nested dates, but won't be more than ~15 per document. I'm filtering by parent and nested child values.

My attempt using a script:

    GET /test/_search
    {
      "query": {
        "bool" : {
          "filter": [
            { "range": { "distance": { "lt": 5 }}},
            {
              "nested": {
                "path": "dates",
                "query": {
                  "bool": {
                    "filter": [
                      { "range": { "dates.rooms": { "gte": 1 } } },
                      { "range": { "dates.timestamp": { "lte": 2 }}},
                      { "range": { "dates.timestamp": { "gte": 1 }}}
                    ]
                  }
                }
              }
            },
            {
              "script": {
                "script": {
                  "source": "doc['dates'].length = 2",
                  "lang": "painless"
                }
              }
            }
          ]
        }
      }
    }

This throws an error: No field found for [dates] in mapping with types []

So I've tried to follow this solution here

Attempt using scoring

    GET /test/_search
    {
      "query" : {
        "function_score": {
          "min_score": 20,
          "boost": 1,
          "functions": [
            {
              "script_score": {
                "script": {
                    "source": "if (_score > 20) { return - 1; } return _score;"
                }
              }
            }
          ],
          "query": {
            "bool" : {
              "filter": [
                { "range": { "distance": { "lt": 5 }}},
                {
                  "nested": {
                    "score_mode": "sum",
                    "boost": 10,
                    "path": "dates",
                    "query": {
                      "bool": {
                        "filter": [
                          { "range": { "dates.rooms": { "gte": 1 } } },
                          { "range": { "dates.timestamp": { "lte": 2 }}},
                          { "range": { "dates.timestamp": { "gte": 1 }}}
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }

But Scoring doesnt work with filters and I can't seem to get this working at all.
Would really appreciate any help on this topic.

Thanks!

Hey,

how about storing the size of the array in an extra field when indexing, so that your query for the size of the array becomes a range query and will be much much faster than the script one?

--Alex

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.