Using the LENGTH string function to limit search

Hello everyone,

I use the 'query_string' command to count members, where I want to limit the results to only those documents where the length in the 'membIdentificNumb' field is exactly 7 characters.
I entered the following command in Kibana:

GET transakcije_2y/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "query_string": {
          "query": "date:[2019-01-01 TO 2019-12-31] AND (type: (41) OR type:(47)) AND acronym: (CTK) **AND (LENGTH(membIdentificNumb)==7)**",
          "time_zone": "Europe/Ljubljana"
        }
      }
    }
  },
  "aggs": {
    "by_membIdentificNumb": {
      "terms": {
        "field": "membIdentificNumb",
        "size": 100,
        "order": {
          "_key": "asc"
        }
      }
    }
  }
}

and got a reply:

{
  "took" : 1679,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 525,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "by_membIdentificNumb" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "**0004798**",
          "doc_count" : 14
        },
        {
          "key" : "**990000006**",
          "doc_count" : 13
        },
...

Why does query not respect the result limits on members with a length of 7? It should find only the first, which has a length of 7, and the others eliminated. The field 'membIdentificNumb' is full text.

It looks like the LENGTH function does not affect the result. If e.g. changing to (LENGTH (membIdentificNumb) == 1) returns the same result.

Thanks for any help
Tinko

Hello again

Please ignore the ** sign in the case as I wanted to label Bold but it didn't work ... :frowning_face:

Tinko

wondering, where did you read about the LENGTH function, as I am not aware such a thing exists.

I think a good way would be to store the length of a field at index time in another field and then filter on that. You could do that using an ingest pipeline with a script processor

Hi, Alexander

thanks for the help and suggestion.

I found the length function at https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-string.html#sql-functions-string-length

I also tried to using script for the same example,

GET transakcije_2y/_search
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "date": {
                  "gte": "01/01/2019",
                  "lte": "31/12/2019",
                  "format": "dd/MM/yyyy||yyyy",
                  "time_zone": "Europe/Ljubljana"
                }
              }
            },
            {
         {
              "match": {
                "acronym": "CTK"
              }
            },
            {
              "script": {
                "script": "doc['membIdentificNumb'].length == 7"
              }
            }
          ],
          "must_not": [
            {
              "match": {
                "patronCategory": "126"
              }
            }
          ],
          "should": [
            {
              "match": {
                "type": "41"
              }
            },
            {
              "match": {
                "type": "47"
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    }
  },

but the results are the same ...

Best regards,
Tinko

the length function is part of the SQL API, but not of the query string query.

While you could use a script query, this will be much slower than following my above suggestions, as you many more calculations at query time than at index time - but works in principle.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.