How to write the following Elastic Search into advanced script query?

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "car_db",
        "_type": "_doc",
        "_id": "11",
        "_score": 1.0,
        "_source": {
          "searchResults": [
            {
              "beginTime": "2024-01-10T10:00:00+00:00",
              "nicknames": ["a_random_value"],
              "id": 6,
              "automobileId": 11,
              "finishedTime": "2024-01-20T10:00:00+00:00"
            }
          ]
        }
      },
      {
        "_index": "car_db",
        "_type": "_doc",
        "_id": "13",
        "_score": 1.0,
        "_source": {
          "searchResults": [
            {
              "beginTime": "2024-01-22T10:00:00+00:00",
              "nicknames": ["a_random_value"],
              "id": 8,
              "automobileId": 13,
              "finishedTime": "2024-01-25T10:00:00+00:00"
            }
          ]
        }
      }
    ]
  }
}

If I use:

GET /car_db/_search
{
  "_source": ["searchResults"],
  "query": {
    "exists": {
      "field": "searchResults"
    }
  },
  "size": 1000
}

I am able to get the list and content as expected. However I need to use scriptQuery because I am trying to do more advanced stuff. However once I use script query I can not get the content of the searchResults.
type or paste code here

How can I achieve the same response using something like:

GET /car_db/_search
{
  "_source": true,
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              if (doc.containsKey('searchResults') && doc['searchResults'].size() > 0) {
                return true;
              } else {
                return false;
              }
            """
          }
        }
      }
    }
  }
}

Ofcourse once I get that to work I will modify it for more advanced operations.
type or paste code here

I would recommend using the query DSL to as great extent as possible as it is likely to be faster and scale better than a complex script query that does it all, which can be very slow and expensive. For the example you gave I would recommend using an exists query clause.

It would probably help if you could describe what you are looking to achieve and provide some details about your use case. What is it you are trying to achieve that query DSL can not support?

With the scriptQuery I am looping through a field which contains an array of objects. Inside the array of object I need to compare it's value to another field in the same record/row, and if it matches that other field in the same record, the record should be returned in the response(so I return as true).

However in my scriptQuery where I do a check with doc containskey, it does not find the key even though I use the same field name as listed by the mapping command:

GET /car_db/_mapping?include_type_name=true

I wonder if this has anything to do with the fact that this field in Query Workbench as appears as searchResults{1} and it is an array of object. It is not nested any deeper than other field that I am able to make the script query work with. Only difference like I said is that it is stored inside {}. Mapping shows the fields are stored under properties.

type or paste code here

Is searchResults mapped as a nested field? If it is, the nested documents are store as separate documents behind the scenes so will not be accessible as doc values on the main document. I have not run any scripted queries on nested documents, but it is possible that you may need to access the source field, which is likely to be a lot slower.

Can you please provide a full example and explain why you are not able to do this using a nested query?

Okay so this is all part of my boolQuery filtering that I am implementing with PHP.

The es index stores data in a field, under searchResults, in that searchResults { locationId, .....}(there is a lot more data in there), and that locationId needs to be compared with data that is sent from the endpoint request e.g. $data['location_id']. If they match, than it means that record i.e. searchResults which is just a field in the row for an index of automobiles needs to be returned together with all the data in the row e.g. automobileId.

protected function getAutomobiles(array $data): BoolQuery
    {
        $boolQuery = new BoolQuery();

        if (isset($data['location_id'])) {
            $script = <<<EOD
                def locationIds = params.locationId ?: [];

                if (doc.containsKey('searchResults')) {
                    for (result in doc['searchResults']) {

                        if (result.containsKey('scope') && result['scope'].size() > 0) {
                            def areaLocationId = doc.containsKey('areaLocationId') ? doc['areaLocationId'].value : null;
                            def regionLocationId = doc.containsKey('regionLocationId') ? doc['regionLocationId'].value : null;
                            
                            for (locationId in locationIds) {
                                if (result['scope'].contains('area') 
                                    && areaLocationId != null && areaLocationId == locationId) {
                                    return true;
                                } else if (result['scope'].contains('region') 
                                    && regionLocationId != null && regionLocationId == locationId) {
                                    return true;
                                }
                            }
                        }
                    }
                }
            EOD;

            $scriptQuery = new ScriptQuery($script);

            $scriptQuery->addParameter(
                'params',
                [
                    'locationId' => $data['location_id'],
                ]
            );

            $boolQuery->add($scriptQuery, BoolQuery::FILTER);
        }

        return $boolQuery;
    }

type or paste code here

Please provide one or more sample documents (in JSON) and explain how you logically query for these. I am not a PHP developer and do not have time to try to interpret the logic from your script.

Thanks for your reply.

Sure, here is the document in json:

{
  "took": 398,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.0,
    "hits": [
      {
        "_index": "automobiles_db",
        "_type": "_doc",
        "_id": "14",
        "_score": 0.0,
        "_source": {
          "carBrand": "Toyota",
          "carEngine": "V6",
          "carPrice": 25000,
          "searchResults": [
            {
              "beginTime": "2024-02-22T10:00:00+00:00",
              "locationIds": [
                "23423"
              ],
              "id": 9,
              "finishedTime": "2024-02-25T10:00:00+00:00"
            },
            {
              "beginTime": "2024-02-27T10:00:00+00:00",
              "locationIds": [
                "324234"
              ],
              "id": 10,
              "finishedTime": "2024-02-28T10:00:00+00:00"
            }
          ]
        }
      }
    ]
  }
}


Mapping:

{
  "automobiles_db": {
    "mappings": {
      "properties": {
        "searchResults": {
          "properties": {
            "locationId": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "finishedTime": {
              "type": "date"
            },
            "id": {
              "type": "long"
            },
            "beginTime": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}

So sum it up, the issue is, the field is stored as an array of objects, and the doc.containsKey('searchResults') seems to not exist. But it is there, I just can't access it. How?

You are not using nested mappings, so Elasticsearch will index the values under the fields as arrays.

"searchResults": [
            {
              "beginTime": "2024-02-22T10:00:00+00:00",
              "locationIds": [
                "23423"
              ],
              "id": 9,
              "finishedTime": "2024-02-25T10:00:00+00:00"
            },
            {
              "beginTime": "2024-02-27T10:00:00+00:00",
              "locationIds": [
                "324234"
              ],
              "id": 10,
              "finishedTime": "2024-02-28T10:00:00+00:00"
            }
          ]

will basically be indexed like this (simplified):

"searchResults.beginTime": ["2024-02-22T10:00:00+00:00", "2024-02-27T10:00:00+00:00"],
"searchResults.locationIds": ["23423", "324234"],
"searchResults.id": [9, 10],
"searchResults.finishedTime": ["2024-02-25T10:00:00+00:00", "2024-02-28T10:00:00+00:00"]

You should therefore be able to check if the document contains searchResults.id or one of the others, but as searchResults does not directly contain any terms this does not show up as a doc value.

If you need access to the full structure you need to access _source, but be aware that can be very slow and scale badly.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.