Hi Elasticsearch team,
I am currently working with reranker models such as Qwen3-Reranker, and I noticed that in the vLLM community the rerank API supports custom text templates for queries and documents. This is crucial for such models, which require specific prompt formatting (see this example) to achieve optimal results.
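For context, the full prompt Qwen3-Reranker expects looks roughly like this (paraphrased from the model card; {instruction}, {query} and {document} are the slots that a template mechanism would need to fill in per request and per document):

<|im_start|>system
Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>
<|im_start|>user
<Instruct>: {instruction}
<Query>: {query}
<Document>: {document}<|im_end|>
<|im_start|>assistant
<think>

</think>
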
Currently, the Elasticsearch inference rerank endpoint does not support passing custom text templates for the inputs. This limits the usability and performance of certain reranker models that rely on prompt engineering.
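As a rough sketch of what would help, something along the following lines when configuring the inference endpoint would cover this use case. The query_template and document_template parameters are purely illustrative suggestions, not an existing API, and the service settings are elided:

// illustrative only - these template parameters do not exist today
PUT _inference/rerank/bailian_rerank
{
  "service": "...",
  "service_settings": { ... },
  "task_settings": {
    "query_template": "<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".<|im_end|>\n<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: {query}\n",
    "document_template": "<Document>: {document}<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
  }
}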
As a workaround, I tried manually constructing the prompt in the inference_text field, but the request fails with an error:
// parts of request
"text_similarity_reranker": {
  "inference_id": "bailian_rerank",
  "inference_text": "<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".<|im_end|>\n<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: Recommend hiking routes around Shanghai?\n",
  "field": "route_text_concat",
  "rank_window_size": 50,
  ...
}
// parts of response
"type": "status_exception",
"reason": "[text_similarity_reranker] search failed - retrievers '[rank_docs_retriever]' returned errors. All failures are attached as suppressed exceptions.",