Cannot access doc values after forcemerging

Hi, I am working on a scripted engine which needs to access doc values to do some calculations.

The code it uses to access the doc values is below:

ScriptDocValues.Longs docValues = (ScriptDocValues.Longs) lookup.doc().getLeafDocLookup(context).get(length_field);

  • length_field is the name of a field (and is a parameter to the script)
  • lookup is of type org.elasticsearch.search.lookup.SearchLookup (it is provided in the scripted engine)
  • context is of type org.elasticsearch.script.ScriptContext (it is also provided in the scripted engine)

When I create an index and add a few documents to it, the scripted engine runs just fine.

However, after force-merging the index with max_num_segments set to 1, running the scripted engine results in an index_out_of_bounds exception from Elasticsearch. Interestingly, even when the exception is thrown, the key is still present in the doc values. I confirmed this by logging the output of the containsKey method:

lookup.doc().getLeafDocLookup(context).containsKey(length_field)

and it prints true.

Can somebody please help me with this?

(Note: I think this question is asking the same thing: "Access doc fields value in script-expert-scoring".)

Can you show the full stack trace? Can you reproduce it using Painless instead of your script engine? Which version of Elasticsearch are you using?

Hi Igor, thanks for the reply.

I actually managed to get it working by using the Lucene doc values methods directly. In case it helps anyone else, the code I used is below:

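// Here context must be the segment's LeafReaderContext (it exposes reader()),
// and currentDocid is the segment-local doc ID.
SortedNumericDocValues iterator = context.reader().getSortedNumericDocValues(length_field);
// getSortedNumericDocValues returns null if the segment has no such doc values field.
if (iterator == null || !iterator.advanceExact(currentDocid)) {
    throw new IOException("Cannot read length of document");
}
// Read the first value stored for this document.
long bodyLength = iterator.nextValue();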

The Lucene methods used above are documented here: https://lucene.apache.org/core/7_5_0/core/org/apache/lucene/index/SortedNumericDocValues.html
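
For anyone unfamiliar with the advanceExact/nextValue pattern in the snippet above, here is a minimal, self-contained sketch of that contract. It is a toy in-memory stand-in, not Lucene's class: the name ToySortedNumericDocValues, the map-based storage, and the readFirst helper are all made up for illustration. Lucene documents that the target passed to advanceExact must not be lower than the current doc ID; throwing an exception in that case is a choice of this sketch only.

```java
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy stand-in for the advanceExact/nextValue contract.
 * Hypothetical class for illustration; this is NOT Lucene's SortedNumericDocValues.
 */
final class ToySortedNumericDocValues {
    private final Map<Integer, long[]> valuesByDoc; // segment-local doc ID -> sorted values
    private long[] current = new long[0];
    private int next = 0;
    private int lastDoc = -1;

    ToySortedNumericDocValues(Map<Integer, long[]> valuesByDoc) {
        this.valuesByDoc = new TreeMap<>(valuesByDoc);
    }

    /** Positions on docId; returns false when the document has no value for the field. */
    boolean advanceExact(int docId) {
        if (docId < lastDoc) {
            // Sketch-only choice: reject backwards moves explicitly.
            throw new IllegalArgumentException("doc IDs must not decrease: " + docId);
        }
        lastDoc = docId;
        long[] values = valuesByDoc.get(docId);
        current = (values == null) ? new long[0] : values;
        next = 0;
        return values != null;
    }

    /** Number of values available for the current document. */
    int docValueCount() {
        return current.length;
    }

    /** Returns the next value; may be called at most docValueCount() times per document. */
    long nextValue() {
        if (next >= current.length) {
            throw new IllegalStateException("no more values for the current document");
        }
        return current[next++];
    }

    /** Same shape as the snippet above: position on the document, then read its first value. */
    static long readFirst(ToySortedNumericDocValues dv, int docId) throws IOException {
        if (!dv.advanceExact(docId)) {
            throw new IOException("Cannot read length of document");
        }
        return dv.nextValue();
    }

    public static void main(String[] args) throws Exception {
        Map<Integer, long[]> segment = new TreeMap<>();
        segment.put(0, new long[] {120L});
        segment.put(2, new long[] {45L, 300L}); // doc 1 has no value for the field
        ToySortedNumericDocValues dv = new ToySortedNumericDocValues(segment);
        System.out.println(readFirst(dv, 0));   // first value of doc 0
        System.out.println(dv.advanceExact(1)); // whether doc 1 has a value
        System.out.println(readFirst(dv, 2));   // first value of doc 2
    }
}
```

The point of the sketch is that the iterator has to be positioned on the current document before any value is read, and that a document without a value is reported by the return value of advanceExact rather than by an exception.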