Accessing the number of matching documents inside a script

Hi,
I developed a custom plugin that modifies the scores of documents matching a user query, according to some internal criteria.
My problem is that the function that modifies the score is very expensive, and I do not want to run it when the user query matches a very large number of documents.
I am wondering how to get this number of matching documents directly inside the script, without maintaining a counter myself.

private static class MyExpertScriptEngine implements ScriptEngine {

    // chosen arbitrarily
    public static final double THRESHOLD = 10000;

    @Override
    public <T> T compile(String scriptName, String scriptSource, ScriptContext<T> context, Map<String, String> params) {
        // we use the script "source" as the script identifier
        if ("pure_df".equals(scriptSource)) {
            SearchScript.Factory factory = (p, lookup) -> new SearchScript.LeafFactory() {
                {
                    // numberOfMatchingDocuments is not defined anywhere:
                    // it is the value this question asks how to obtain
                    if (numberOfMatchingDocuments > THRESHOLD) {
                        throw new InternalException("Cannot apply the expensive cost function because the number of matching documents is greater than " + THRESHOLD);
                    }
                }

                @Override
                public SearchScript newInstance(LeafReaderContext context) throws IOException {
                    return new SearchScript(p, lookup, context) {

                        @Override
                        public double runAsDouble() {
                            // run our expensive function to modify the score of matching documents
                            return 0.0d; // placeholder for the modified score
                        }
                    };
                }
            };
            return context.factoryClazz.cast(factory);
        }
        throw new IllegalArgumentException("Unknown script name " + scriptSource);
    }
}


POST /_search
{
  "query": {
    "function_score": {
      "query": {
        user_query
      },
      "functions": [
        {
          "script_score": {
            "script": {
                "source": "pure_df",
                "lang" : "expert_scripts",
                "params": {
                    "param1": "value1",
                    "params2": "value2"
                }
            }
          }
        }
      ]
    }
  }
}

Thanks for your help

If your "expensive" function is packaged as a query, you could look at using rescoring to limit the number of documents this logic is applied to.
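
As a rough sketch of that suggestion, the request from the question could be restructured so that the script only runs in a rescore phase. This assumes the script can be wrapped in a function_score query as in the original request; user_query is the same placeholder as above, and the window_size of 100 is an arbitrary value chosen for illustration. Note that window_size applies per shard, and that this does not expose the number of matching documents to the script: it only caps how many documents the expensive function is run on.

```json
POST /_search
{
  "query": {
    user_query
  },
  "rescore": {
    "window_size": 100,
    "query": {
      "score_mode": "multiply",
      "rescore_query": {
        "function_score": {
          "functions": [
            {
              "script_score": {
                "script": {
                  "source": "pure_df",
                  "lang": "expert_scripts",
                  "params": {
                    "param1": "value1",
                    "params2": "value2"
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}
```

The "multiply" score_mode is an assumption meant to approximate function_score's default behaviour of multiplying the query score by the function result; other modes such as "total" may suit the scoring logic better.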
