IndexOutOfBoundsException after update from 6.2 to 6.4


#1

Hi,

I have a large multi search query that contains fairly complicated aggregations. I recently upgraded Elasticsearch from 6.2 to 6.4, and when this query is executed through the official PHP client, I get an error in my log file that looks like this:

[2018-10-16T13:02:57,198][DEBUG][o.e.a.s.TransportSearchAction] [-wDq1Cy] [products][1], node[-wDq1CyCS9-E3etx9dftqg], [P], s[STARTED], a[id=RZQYSUEcQBuWKI-4VxdXZA]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, $
org.elasticsearch.transport.RemoteTransportException: [-wDq1Cy][127.0.0.1:9300][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.search.query.QueryPhaseExecutionException: Query Failed [Failed to execute main query]
        at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:298) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:107) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesService.lambda$loadIntoContext$17(IndicesService.java:1184) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesService.lambda$cacheShardLevelResult$18(IndicesService.java:1237) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:160) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesRequestCache$Loader.load(IndicesRequestCache.java:143) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:433) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:116) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1243) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1183) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:322) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:357) ~[elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:333) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:329) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.search.SearchService$3.doRun(SearchService.java:1019) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.4.2.jar:6.4.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.4.2.jar:6.4.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.lang.IndexOutOfBoundsException
        at java.nio.Buffer.checkIndex(Buffer.java:540) ~[?:1.8.0_171]
        at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:253) ~[?:1.8.0_171]

It is important to mention that this does not happen when I execute the query via Kibana, where I receive correct results. On version 6.2 this was not an issue at all.

I noticed that the problem appears at the last two top_hits aggregations. If I remove them, I receive correct results.
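One way to narrow this down systematically is to remove suspect aggregations by name and re-run the request after each removal. The sketch below is only an illustration of that bisecting step: `drop_aggs` is a hypothetical helper (not part of any Elasticsearch client), and the sample tree is a trimmed version of the aggregations posted later in this thread.

```python
import copy


def drop_aggs(aggs, names):
    """Return a copy of an "aggs" mapping without the aggregations listed in names.

    Hypothetical helper for bisecting a failing request; the input is not mutated.
    """
    kept = {}
    for name, body in aggs.items():
        if name in names:
            continue  # skip this aggregation and everything nested under it
        body = copy.deepcopy(body)
        if "aggs" in body:
            sub = drop_aggs(body["aggs"], names)
            if sub:
                body["aggs"] = sub
            else:
                del body["aggs"]  # avoid leaving an empty "aggs" object behind
        kept[name] = body
    return kept


# Trimmed version of the aggregation tree from this thread.
aggs = {
    "option_group_id": {
        "terms": {"field": "option_groups.id", "size": 10},
        "aggs": {
            "option_group_name": {"top_hits": {"size": 1}},
            "option": {
                "nested": {"path": "option_groups.option"},
                "aggs": {"full_option": {"top_hits": {"size": 1}}},
            },
        },
    }
}

reduced = drop_aggs(aggs, {"option_group_name", "full_option"})
```

Dropping one name at a time instead of both would show whether a single top_hits aggregation is enough to trigger the failure.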

I had 1 node and 2 shards on my local machine when the problem actually appeared. When I increased the number of shards to 3 or 5, the problem disappeared.

I checked the two indices I am querying in the Kibana monitoring section, and this is the strange thing I noticed:

    name: index1
    status: green
    document count: 550
    data: 211.4 KB
    index rate: 0 /s
    search rate: 0.01 /s
    unassigned shards: 0

Strangely, I actually have only 28 indexed documents in index1, so I am confused about how these numbers are counted and what they represent.
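A likely explanation, assuming the monitoring figure comes from the Lucene-level document count: every object stored under a `nested` field is indexed as its own hidden Lucene document, so a single source document with several nested objects is counted several times. The sketch below illustrates that arithmetic only. `lucene_doc_count` is a hypothetical helper, the sample document and its values are invented, and the nested paths are assumed from the aggregations used in this thread.

```python
def lucene_doc_count(source, nested_paths):
    """Count the root document plus one hidden document per nested object.

    Hypothetical helper; nested_paths is the set of field paths assumed to be
    mapped as "nested".
    """
    def count(obj, prefix):
        total = 0
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict):
                    if path in nested_paths:
                        total += 1  # each nested object is a separate Lucene document
                    total += count(item, path)
        return total

    return 1 + count(source, "")


# Invented sample document shaped like the fields queried in this thread.
doc = {
    "search_tags": "t-shirt",
    "template": {"id": 1, "name": "T-shirt"},
    "option_groups": [
        {"id": 10, "name": "Size",
         "option": [{"id": 100, "name": "S"}, {"id": 101, "name": "M"}]},
        {"id": 11, "name": "Color",
         "option": [{"id": 110, "name": "Red"}]},
    ],
}
paths = {"template", "option_groups", "option_groups.option"}

total = lucene_doc_count(doc, paths)  # 1 root + 1 template + 2 groups + 3 options
```

With a shape like this, 28 source documents could plausibly add up to several hundred counted documents.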

My workaround for this problem is to increase the number of shards. But it seems strange to allocate 5 shards for two indices holding 28 and 112 documents just to make those aggregations work. Am I doing something wrong?
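For reference, the primary shard count is set when an index is created, so trying a different value typically means creating a new index with the desired settings and reindexing into it. A minimal sketch of such a create-index body follows. `index_settings` is a hypothetical helper, and the replica count of 0 is an assumption suited to a single local node.

```python
import json


def index_settings(shards, replicas=0):
    """Build a create-index request body with an explicit shard layout.

    Hypothetical helper; replicas=0 is an assumption for a single-node setup.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


body = json.dumps(index_settings(3), indent=2)
print(body)
```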


#2

BTW, this is one of the queries executed in the multi search query:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "search_tags": {
              "query": "t-sh",
              "analyzer": "standard",
              "operator": "and"
            }
          }
        }
      ],
      "must_not": []
    }
  },
  "aggs": {
    "templates_matched": {
      "nested": {
        "path": "template"
      },
      "aggs": {
        "templates": {
          "filter": {
            "match": {
              "template.name": {
                "query": "t-sh",
                "analyzer": "standard"
              }
            }
          },
          "aggs": {
            "template_id": {
              "terms": {
                "field": "template.id",
                "size": 10
              },
              "aggs": {
                "full_template": {
                  "top_hits": {
                    "size": 1,
                    "_source": {
                      "includes": "template.name"
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "options_matched": {
      "nested": {
        "path": "template"
      },
      "aggs": {
        "template_id": {
          "terms": {
            "field": "template.id",
            "size": 10
          },
          "aggs": {
            "full_template": {
              "top_hits": {
                "size": 1,
                "_source": {
                  "includes": "template"
                }
              }
            },
            "option_groups": {
              "nested": {
                "path": "option_groups"
              },
              "aggs": {
                "option_groups": {
                  "filter": {
                    "nested": {
                      "path": "option_groups.option",
                      "query": {
                        "match": {
                          "option_groups.name": {
                            "query": "t-sh",
                            "analyzer": "standard"
                          }
                        }
                      }
                    }
                  },
                  "aggs": {
                    "option_group_id": {
                      "terms": {
                        "field": "option_groups.id",
                        "size": 10
                      },
                      "aggs": {
                        "option_group_name": {
                          "top_hits": {
                            "size": 1,
                            "_source": {
                              "includes": "option_groups.name"
                            }
                          }
                        },
                        "option": {
                          "nested": {
                            "path": "option_grouos.option"
                          },
                          "aggs": {
                            "option": {
                              "filter": {
                                "match": {
                                  "option_groups.option.name": {
                                    "query": "t-sh",
                                    "analyzer": "standard"
                                  }
                                }
                              },
                              "aggs": {
                                "option_id": {
                                  "terms": {
                                    "field": "option_groups.option.id",
                                    "size": 10
                                  },
                                  "aggs": {
                                    "full_option": {
                                      "top_hits": {
                                        "size": 1,
                                        "_source": {
                                          "includes": "option_groups.option"
                                        }
                                      }
                                    }
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
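
Because the failure shows up through one client but not through Kibana, it can help to compare the exact payload each one sends. The multi search endpoint expects newline-delimited JSON: one header line and one body line per search, with a trailing newline at the end. The sketch below assembles such a payload by hand. `build_msearch_body` is a hypothetical helper, the index names are the ones mentioned in this thread, and the query bodies are simplified stand-ins for the real ones.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, body) pairs into the NDJSON payload used by _msearch.

    Hypothetical helper; each search contributes one header line and one body line.
    """
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"  # the payload must end with a newline


# Simplified stand-ins for the real queries.
searches = [
    ({"index": "products"}, {"size": 0, "query": {"match_all": {}}}),
    ({"index": "index1"},
     {"size": 0,
      "query": {"match": {"search_tags": {"query": "t-sh", "operator": "and"}}}}),
]

payload = build_msearch_body(searches)
```

Sending the searches one at a time in the same format is one way to confirm which of them fail on their own.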

Hopefully, someone can help me. Thanks! :)


(Alexander Reelsen) #3

Is this happening consistently for all of your queries or only happening every now and then?

Also do you have a full stack trace available?


#4

Hi Alexander, thanks for responding.

My multi search query contains 4 queries, of which only 2 are failing (the two similar ones; one of them is pasted above). It happens every time the "option group" / "option" top_hits aggregations need to be computed.

This should be the full stack trace from today:

stack trace

The stack trace is too large to be pasted here, so I had to share it this way. Please be aware that this stack trace is from my local environment.


(Alexander Reelsen) #5

I think this warrants opening an issue in the Elasticsearch repository. That NullPointerException looks weird to me.

Can you go ahead and file the issue and attach that stack trace to it?
Many thanks!

--Alex


#6

Right away! :) Thank you for your response.

Dragan

