Script Returning Unexpected Result

Hello Elastic Community,

I am starting to write some server-side scripts and have run into a few issues. I think they stem from gaps in my understanding of the data structures and objects that the script handles.

The goal of the script is, given a set of values, to check a field in the document (which should hold multiple values, i.e. an array) and return true if the field's values consist only of values from the given set.

For example:
given: ['A', 'B'] and comparing against "myField"

with doc1:

{
   "myField": ["A", "B", "A"]
}

doc2:

{
   "myField": ["A", "B", "C"]
}

doc3:

{
   "myOtherField": "something I dont care about"
}

I am hoping to return doc1 and not doc2 or doc3.

This is what I have so far:

doc['myField.keyword'].size() != 0 && doc['myField.keyword']
              .stream()
              .allMatch(item -> item.equals("A") || item.equals("B"));

This script has been returning documents with any values for 'myField' and does not seem to honor the second clause of the boolean expression. I am curious what I may be missing here.

Thank you for taking the time and offering your help, as it tremendously helps further my learning in the Elasticsearch ecosystem!

CJ

Hi @cj_hillbrand

This question looks like yours; maybe the solution can help you.

Thanks for the link. The problem there is slightly different, but the same logic should carry over. Since the field being queried does not live in a nested object, do you know what changes would be needed to achieve the same effect?

Here is what I have so far, although it is still returning results with a mix of requested and non-requested values:

{
  "query": {
    "bool": {
      "must_not": [
        {
          "bool": {
            "must_not": [
              {
                "match": {
                  "attributes.ssdFirmware.keyword": {
                    "query": "1.0 "
                  }
                }
              },
              {
                "match": {
                  "attributes.ssdFirmware.keyword": {
                    "query": "EDA7BM5Q"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
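
For reference, the boolean logic of the query above can be sketched in plain Java (not Elasticsearch code). The sketch assumes each match clause behaves as an exact test on the field's keyword values; the class and method names are made up for illustration. Under that assumption, "not (not X and not Y)" reduces to "X or Y", so any document holding at least one of the two values matches, which would explain the mixed results.

```java
import java.util.List;

// Plain-Java model of the bool query's logic; not Elasticsearch code.
class DoubleNegationSketch {

    // Inner bool: must_not [match "1.0 ", match "EDA7BM5Q"] -> neither value is present.
    static boolean innerBool(List<String> values) {
        return !values.contains("1.0 ") && !values.contains("EDA7BM5Q");
    }

    // Outer bool: must_not [inner bool] -> at least one of the two values is present.
    static boolean outerBool(List<String> values) {
        return !innerBool(values);
    }

    // What the thread is after: non-empty, and every value is one of the two.
    static boolean onlyRequested(List<String> values) {
        return !values.isEmpty()
            && values.stream().allMatch(v -> v.equals("1.0 ") || v.equals("EDA7BM5Q"));
    }

    public static void main(String[] args) {
        List<String> mixed = List.of("80430E00", "20036P00", "1.0 ");
        System.out.println(outerBool(mixed));     // true: one requested value is enough
        System.out.println(onlyRequested(mixed)); // false: other values are present
    }
}
```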

Also, I should correct my last comment: the field does live in a nested object, but the array being queried is not a collection of complex objects.

Hi @cj_hillbrand,
Your script works. Keep in mind values are de-duplicated when stored in doc-values, so doc['myField.keyword'].size() for doc 1 is 2.
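
To make the de-duplication point concrete, here is a minimal plain-Java sketch (not Painless and not Elasticsearch code). The `docValues` helper and the class name are hypothetical, and modelling doc values as a sorted, de-duplicated list is an assumption made for illustration. It also shows why the `size() != 0` guard matters: `allMatch` on an empty stream returns true, so without the guard a document lacking the field would pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java stand-in for the Painless filter; not Elasticsearch code.
class AllMatchSketch {

    // Hypothetical helper: models keyword doc values as sorted, de-duplicated strings.
    static List<String> docValues(List<String> source) {
        return new ArrayList<>(new TreeSet<>(source));
    }

    // Same boolean expression as the script, with the allowed set passed in.
    static boolean onlyAllowed(List<String> values, Set<String> allowed) {
        return values.size() != 0 && values.stream().allMatch(allowed::contains);
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("A", "B");
        List<String> doc1 = docValues(List.of("A", "B", "A"));
        List<String> doc2 = docValues(List.of("A", "B", "C"));
        List<String> doc3 = docValues(List.of()); // myField is absent

        System.out.println(doc1.size());                // 2, the duplicate "A" is gone
        System.out.println(onlyAllowed(doc1, allowed)); // true
        System.out.println(onlyAllowed(doc2, allowed)); // false
        System.out.println(onlyAllowed(doc3, allowed)); // false, thanks to the size guard
        // Without the guard, an empty stream matches vacuously.
        System.out.println(doc3.stream().allMatch(allowed::contains)); // true
    }
}
```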

If you can provide a reproduction via the Dev Console, it'd help debug.

PUT _bulk
{ "index" : { "_index" : "allmatch", "_id" : "1" } }
{ "myField": ["A", "B", "A"] }
{ "index" : { "_index" : "allmatch", "_id" : "2" } }
{ "myField": ["A", "B", "C"] }
{ "index" : { "_index" : "allmatch", "_id" : "3" } }
{ "myOtherField": "something I dont care about" }


GET allmatch/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
            doc['myField.keyword'].size() != 0 && doc['myField.keyword']
              .stream()
              .allMatch(item -> item.equals("A") || item.equals("B"));
            """
          }
        }
      }
    }
  }
}

Returns

{
  "took" : 20,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "allmatch",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "myField" : [
            "A",
            "B",
            "A"
          ]
        }
      }
    ]
  }
}

Thanks for testing that out. Let me post the real-world scenario, as something may have been lost when moving from the simple case to the particular case that is causing trouble.

Here is an example of the query being used:

{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": """
              doc['attributes.ssdFirmware.keyword'].size() != 0 && doc['attributes.ssdFirmware.keyword']
              .stream()
              .allMatch(item -> item.equals("1.0 ") || item.equals("EDA7BM5Q"))
            """
          }
        }
      ]
    }
  }
}

and here is a (truncated) document that is returned from the search:

{
  "_index" : "node_availability-54070522072022",
  "_type" : "_doc",
  "_id" : "PlbhJoIBxAX7ZL_CWwLe",
  "_score" : 1.0,
  "_source" : {
    "attributes" : {
      "ssdFirmware" : [
        "80430E00",
        "20036P00",
        "1.0 "
      ]
    }
  }
}

Thanks again for taking the time to help.

Hi @cj_hillbrand,
I'm still getting the correct result.

I find it useful to reproduce the issue in the Dev Console. Could you post a reproduction that's easy to copy and paste into the Dev Console, rather than snippets?

I suspect that creating a minimal Dev Console reproduction will get you far along toward solving the issue.

PUT _bulk
{"index":{"_index":"unexpected","_id":"1"}}
{"attributes":{"ssdFirmware":["80430E00","20036P00","1.0 "]}}

GET unexpected/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
             doc['attributes.ssdFirmware.keyword'].size() != 0 && doc['attributes.ssdFirmware.keyword']
              .stream()
              .allMatch(item -> item.equals("1.0 ") || item.equals("EDA7BM5Q"))
            """
          }
        }
      }
    }
  }
}

Returns

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

As expected, the following matches.

GET unexpected/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
             doc['attributes.ssdFirmware.keyword'].size() != 0 && doc['attributes.ssdFirmware.keyword']
              .stream()
              .allMatch(item -> item.equals("1.0 ") || item.equals("80430E00") || item.equals("20036P00"))
            """
          }
        }
      }
    }
  }
}
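
The two searches above can be mirrored in a small plain-Java sketch (again, not Painless; the class and method names are made up for illustration). It also shows that `equals` is an exact comparison, so the trailing space in "1.0 " has to be present in the allowed value for that entry to count.

```java
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for the script filter; not Elasticsearch code.
class FirmwareMatchSketch {

    // Non-empty, and every value is in the allowed set (exact string equality).
    static boolean onlyAllowed(List<String> values, Set<String> allowed) {
        return values.size() != 0 && values.stream().allMatch(allowed::contains);
    }

    public static void main(String[] args) {
        List<String> firmware = List.of("80430E00", "20036P00", "1.0 ");

        // First search: two of the three values are not in the allowed set.
        System.out.println(onlyAllowed(firmware, Set.of("1.0 ", "EDA7BM5Q")));             // false
        // Second search: all three values are allowed.
        System.out.println(onlyAllowed(firmware, Set.of("1.0 ", "80430E00", "20036P00"))); // true
        // "1.0" without the trailing space is a different string.
        System.out.println(onlyAllowed(firmware, Set.of("1.0", "80430E00", "20036P00")));  // false
    }
}
```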

Hey Stu,

Thanks for your patience, and for the recommendation to post something that can be copied and pasted into the Dev Console rather than snippets. I'll be sure to carry that with me in future posts.

I went ahead and tried the example that you had outlined, and things worked as expected. I even tried the old example that I had originally posted and it works as well.

Stu, to be frank, I think I had a line in between the GET my_index/_search line and the JSON query.
I'll go ahead and mark your response as the answer, effectively closing this thread.

Thank you again for the time, recommendations and patience!
CJ

