How to retrieve the length of an array in a nested document using painless

My question is: how do I get the length of an array field in a nested document using a Painless script?

Here is the problem I am trying to solve. We have documents which can have a set of labels assigned to them. These labels are updated by users. Users need to be able to filter documents by the existence of these labels. One use case is to find documents which have a specific set of labels and no other labels.

Example:

A user filters for documents containing foo, bar, and baz.

A document which contains ["foo", "bar", "baz"] should match.
A document which contains ["foo", "bar", "baz", "qux"] should not match.
A document which contains ["foo", "bar"] should not match.

My research indicates that the best way to accomplish this is to write a query which matches documents that have all of the labels I want AND that have the same number of labels as I am searching for. It is often suggested that you index an additional field holding the count to make this search fast.
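To make the intended semantics concrete, here is a minimal sketch of the matching rule in plain Python. This is not Elasticsearch code; the function name `matches_exactly` is made up for illustration, and it assumes a document never contains the same label twice.

```python
def matches_exactly(doc_labels, wanted):
    """Return True when the document has every wanted label and no others.

    Assumption: labels are unique within a document, so comparing the
    counts is enough to rule out extra labels.
    """
    has_all = all(label in doc_labels for label in wanted)
    same_count = len(doc_labels) == len(wanted)
    return has_all and same_count


wanted = ["foo", "bar", "baz"]
print(matches_exactly(["foo", "bar", "baz"], wanted))         # True
print(matches_exactly(["foo", "bar", "baz", "qux"], wanted))  # False
print(matches_exactly(["foo", "bar"], wanted))                # False
```

The "has all" half corresponds to the label clauses in my query, and the "same count" half is what I am trying to express with the script.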

However, my documents are updated frequently, and from various sources, so ensuring this extra field is updated correctly whenever the array changes is complicated. We may eventually do that, but right now I am hoping a script query will handle things for us.

Here is my mapping:

{
  "mappings": {
    "_doc": {
      "properties": {
        "metadata": {
          "properties": {
            "labels": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }
  }
}

The actual documents are much larger and there are many other sections similar to the metadata section above.

An example document for the above mapping would look like this:

{
  "metadata": {
    "labels": [
      { "name": "foo" },
      { "name": "bar" },
      { "name": "baz" }
    ]
  }
}

Here is my attempt at a query to find this document:

{
  "query": {
    "bool": {
      "must": [ { "bool": { "adjust_pure_negative": true, "boost": 1 } } ],
      "filter": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "must": [
                    {
                      "nested": {
                        "query": {
                          "terms": {
                            "metadata.labels.name": [
                              "Another?"
                            ],
                            "boost": 1
                          }
                        },
                        "path": "metadata.labels",
                        "ignore_unmapped": false,
                        "score_mode": "none",
                        "boost": 1
                      }
                    },
                    {
                      "nested": {
                        "query": {
                          "terms": {
                            "metadata.labels.name": [
                              "another label"
                            ],
                            "boost": 1
                          }
                        },
                        "path": "metadata.labels",
                        "ignore_unmapped": false,
                        "score_mode": "none",
                        "boost": 1
                      }
                    },
                    {
                      "nested": {
                        "query": {
                          "script": {
                            "script": {
                              "source": "if (doc['metadata'] == null || doc['metadata.labels'] == null) {  false;} else {  doc['metadata.labels'].length == params.termCount;}",
                              "lang": "painless",
                              "params": {
                                "termCount": 2
                              }
                            },
                            "boost": 1
                          }
                        },
                        "path": "metadata.labels",
                        "ignore_unmapped": true,
                        "score_mode": "none",
                        "boost": 1
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}


This results in the following error:

{
  "reason": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:81)",
      "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:39)",
      "if (doc['metadata'] == null || doc['metadata.labels'] == null) {  ",
      "        ^---- HERE"
    ],
    "script": "if (doc['metadata'] == null || doc['metadata.labels'] == null) {  false;} else {  doc['metadata.labels'].length == params.termCount;}",
    "lang": "painless",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "No field found for [metadata] in mapping with types [_doc]"
    }
  }
}

I have also tried replacing the script clause with this:

{
  "nested": {
    "query": {
      "script": {
        "script": {
          "source": "if (params._source.metadata == null || params._source.metadata.labels == null) {  false;} else {  params._source.metadata.labels.length == params.termCount;}",
          "lang": "painless",
          "params": {
            "termCount": 2
          }
        },
        "boost": 1
      }
    },
    "path": "metadata.labels",
    "ignore_unmapped": true,
    "score_mode": "none",
    "boost": 1
  }
}

But this results in this error:

{
  "reason": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "if (params._source.metadata == null || params._source.metadata.labels == null) {  ",
      "                  ^---- HERE"
    ],
    "script": "if (params._source.metadata == null || params._source.metadata.labels == null) {  false;} else {  params._source.metadata.labels.length == params.termCount;}",
    "lang": "painless",
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": null
    }
  }
}

How do I correctly access the length of the array field within a nested document?

Thanks,
Quincy
