Native Sort Plugin / Delimited Payload Token Filter throws "Cannot iterate twice"

mrkamel · October 5, 2017, 1:11pm

Hi,

I've written a native (Java) sort plugin, extending AbstractDoubleSearchScript, that sorts documents by custom scores. The scores are indexed through a delimited payload token filter in the form "keyword1|0.812 keyword2|0.741 ...". The plugin iterates over a list of keywords (passed as script params), adds up the payload of every matching term position, and returns the average as the sort value.

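  @Override
  public double runAsDouble() {
    double res = 0.0; // running sum of payload scores
    int n = 0;        // number of term positions seen

    for (String keyword : keywords) {
      try {
        // Request payloads for every position of this keyword in "term_scores"
        for (TermPosition termPosition : indexLookup().get("term_scores").get(keyword, IndexLookup.FLAG_PAYLOADS)) {
          res += termPosition.payloadAsFloat(0.0f);
          n++;
        }
      } catch (ElasticsearchException e) {
        e.printStackTrace();

        return 0.0;
      }
    }

    // No matching positions: avoid dividing by zero
    if (n == 0) {
      return 0.0;
    }

    return res / (double) n;
  }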
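
For readers who want to check the arithmetic outside Elasticsearch, here is a minimal stand-alone sketch of the value the script is meant to compute. It is only a model: PayloadAverageSketch, parse and average are hypothetical names, not Elasticsearch or Lucene APIs, and it assumes '|' as the payload delimiter and whitespace tokenization.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone model; none of these names are Elasticsearch APIs.
final class PayloadAverageSketch {

  // Splits "term|payload" tokens on whitespace, modelling what a delimited
  // payload token filter with '|' as delimiter would store per position.
  static Map<String, List<Float>> parse(String fieldValue) {
    Map<String, List<Float>> payloads = new HashMap<>();
    for (String token : fieldValue.trim().split("\\s+")) {
      int sep = token.lastIndexOf('|');
      if (sep < 0) {
        continue; // token without a payload: nothing to score
      }
      String term = token.substring(0, sep);
      float score = Float.parseFloat(token.substring(sep + 1));
      payloads.computeIfAbsent(term, k -> new ArrayList<>()).add(score);
    }
    return payloads;
  }

  // Mean payload over every position of every requested keyword,
  // or 0.0 when none of the keywords occur in the document.
  static double average(Map<String, List<Float>> payloads, List<String> keywords) {
    double sum = 0.0;
    int n = 0;
    for (String keyword : keywords) {
      List<Float> scores = payloads.getOrDefault(keyword, Collections.emptyList());
      for (float score : scores) {
        sum += score;
        n++;
      }
    }
    return n == 0 ? 0.0 : sum / n;
  }
}
```

Calling average(parse("keyword1|0.812 keyword2|0.741"), Arrays.asList("keyword1", "keyword2")) yields roughly 0.7765, up to float rounding, since the payloads are read as floats just like payloadAsFloat does in the script.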

This, however, throws tons of "org.elasticsearch.ElasticsearchException: Cannot iterate twice! If you want to iterate more that once, add _CACHE explicitly." errors, even if I pass only a single keyword to iterate over. Why does this code get called multiple times? Is ES executing runAsDouble on every comparison during its internal sort process? Adding FLAG_CACHE works, but I'd still like to know why the code is called more than once.
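
To make the error message concrete, here is a rough plain-Java model of a position list that can be walked only once unless caching is switched on. This is an assumption about the general pattern the message describes, not the actual IndexLookup implementation; OneShotPositions and its cacheEnabled parameter are hypothetical names. It does not explain why the script would run more than once for the same document.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of a single-pass position list; not Elasticsearch code.
final class OneShotPositions implements Iterable<Float> {
  private final Iterator<Float> source; // forward-only stream of payloads
  private final List<Float> cache;      // null when caching is disabled
  private boolean consumed = false;

  OneShotPositions(List<Float> payloads, boolean cacheEnabled) {
    this.source = payloads.iterator();
    this.cache = cacheEnabled ? new ArrayList<>() : null;
  }

  @Override
  public Iterator<Float> iterator() {
    if (!consumed) {
      consumed = true;
      if (cache == null) {
        return source; // first and only pass over the underlying stream
      }
      // Drain the stream once so that later passes can replay the copy.
      while (source.hasNext()) {
        cache.add(source.next());
      }
    }
    if (cache == null) {
      throw new IllegalStateException("Cannot iterate twice without caching");
    }
    return cache.iterator();
  }

  // Adds up all payloads in one pass over the given positions.
  static double sum(Iterable<Float> positions) {
    double total = 0.0;
    for (float payload : positions) {
      total += payload;
    }
    return total;
  }
}
```

In this model, calling sum twice on an instance built with cacheEnabled set to false fails on the second call, while an instance built with caching enabled can be summed repeatedly. That matches the observation that adding FLAG_CACHE makes the exception go away.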

Interestingly, the code in this Stack Overflow answer, https://stackoverflow.com/a/21481792, doesn't use _CACHE, but maybe this is a matter of ES versioning, or of inline scripts adding _CACHE automatically ...

Thanks in advance

rjernst · October 12, 2017, 9:47pm

The problem is that you are using IndexLookup, which is a wrapper over the Lucene APIs. It does its own caching internally, but does not pass any additional flags, such as _CACHE, to Lucene.

You should instead look at the more recent examples of advanced scripts, which use the Lucene APIs directly:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/modules-scripting-engine.html
