Length filter: array index out of bounds exception

I encountered a strange issue: a query fails depending on the position of a term, and only when the length filter is active (see below):

  1. works:
GET test/_search?filter_path=**.productNumber
{
  "query": {
    "match": {
      "productNumber": {
        "query": "abc def ghij 3d"
      }
    }
  }
}
  2. fails:
GET test/_search?filter_path=**.productNumber
{
  "query": {
    "match": {
      "productNumber": {
        "query": "abc def 3d ghij"
      }
    }
  }
}


          "caused_by" : {
            "type" : "array_index_out_of_bounds_exception",
            "reason" : "Index 0 out of bounds for length 0"

ES versions tested: 7.5, 7.6
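
A possible way to narrow this down is to run both query strings through the analyzer with the `_analyze` API and compare the emitted tokens and their positions. This is only a sketch: it assumes the `test` index from this post exists, and `explain` is optional (it adds per-token attributes to the output). The request may succeed even if the search itself fails.

```console
# Assumes the test index and analyzer from this post already exist.
# Repeat with "abc def ghij 3d" to compare the two token streams.
GET test/_analyze
{
  "analyzer": "word_split_product_number_analyzer",
  "text": "abc def 3d ghij",
  "explain": true
}
```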

Index settings:

PUT test
{
  "settings": {
    "number_of_shards": "1",
    "number_of_replicas": "0",
    "analysis": {
      "filter": {
        "length_min_2": {
          "type": "length",
          "min": 2
        },
        "word_split_product_number": {
          "type": "word_delimiter_graph",
          "split_on_numerics": true,
          "generate_number_parts": true,
          "catenate_words": true,
          "catenate_numbers": true,
          "catenate_all": true,
          "preserve_original": true
        }
      },
      "analyzer": {
        "word_split_product_number_analyzer": {
          "filter": [
            ...
          ],
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "productNumber": {
        "type": "text",
        "analyzer": "word_split_product_number_analyzer"
      }
    }
  }
}

Test docs:

PUT test/_bulk

I found the following workaround, but it would be great to have the length filter working too.
Instead of:

        "length_min_2": {
          "type": "length",
          "min": 2
        }

use these:

        "stop_empty": {
          "type": "stop",
          "stopwords": [ "" ]
        },
        "pattern_length_min_2": {
          "type": "pattern_replace",
          "pattern": "^.$",
          "replacement": ""
        }

      "analyzer": {
        "word_split_product_number_analyzer": {
          "filter": [
            ...
          ],
          "tokenizer": "whitespace"
        }
      }

The "workaround" above does not work today :frowning:
I'm getting the same error as above. It seems I made some mistakes during testing...

Today's workaround: use the combination of word_delimiter (not graph!) + flatten_graph.
So, is this a bug in the graph token stream?
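
For anyone who wants to try that combination, a minimal sketch of the analysis settings could look like the following. The index name `test_workaround`, the order of the filter chain, and keeping `length_min_2` at the end are all assumptions for illustration; the thread does not state the exact chain that was used.

```console
# Hypothetical index name; the filter order below is an assumption, not taken from the post.
PUT test_workaround
{
  "settings": {
    "analysis": {
      "filter": {
        "length_min_2": {
          "type": "length",
          "min": 2
        },
        "word_split_product_number": {
          "type": "word_delimiter",
          "split_on_numerics": true,
          "generate_number_parts": true,
          "catenate_words": true,
          "catenate_numbers": true,
          "catenate_all": true,
          "preserve_original": true
        }
      },
      "analyzer": {
        "word_split_product_number_analyzer": {
          "tokenizer": "standard",
          "filter": [ "word_split_product_number", "flatten_graph", "length_min_2" ]
        }
      }
    }
  }
}
```

The only changes relative to the original settings are the filter type (`word_delimiter` instead of `word_delimiter_graph`) and the added built-in `flatten_graph` filter.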

Any feedback from the ES engineers? Thanks!

I opened https://github.com/elastic/elasticsearch/issues/54434, since at the very least there should be either a proper exception or a fix :slight_smile:
