ELSER v2 model (version 12.0.0) inference fails on Elasticsearch 9.0.3

I installed Elasticsearch 9.0.3 with a Docker Compose file, running on CentOS 7 on an x86_64 platform. When I call the inference API with:

POST _inference/.elser-2-elasticsearch
{
  "input": "What is Elastic?"
}
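For reference, the same request can be sent from the shell with curl. The endpoint, user, and password below are placeholders — adjust them to match your docker-compose setup:

```shell
# Hypothetical endpoint and credentials -- adjust to your docker-compose setup.
ES_URL="${ES_URL:-https://localhost:9200}"
BODY='{"input": "What is Elastic?"}'

# -k skips TLS verification for the self-signed docker-compose certificate;
# prefer --cacert <path-to-ca.crt> in real use.
curl -s -k -u elastic:changeme \
     -H 'Content-Type: application/json' \
     -X POST "$ES_URL/_inference/.elser-2-elasticsearch" \
     -d "$BODY" \
  || echo "request failed (is Elasticsearch reachable at $ES_URL?)"
```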

the Elasticsearch backend always returns this error:

{"@timestamp":"2025-07-03T14:57:14.725Z", "log.level":"ERROR", "message":"[.elser-2-elasticsearch] Error processing results", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#14]","log.logger":"org.elasticsearch.xpack.ml.inference.pytorch.process.PyTorchResultProcessor","elasticsearch.cluster.uuid":"TjvQKf_RR5CQIzbnxYbAZw","elasticsearch.node.id":"Ma0mBGshSECiqSFURRkd9A","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag","error.type":"org.elasticsearch.xcontent.XContentEOFException","error.message":"[3:1] Unexpected end of file","error.stack_trace":"org.elasticsearch.xcontent.XContentEOFException: [3:1] Unexpected end of file\n\tat org.elasticsearch.xcontent.impl@8.17.8/org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:62)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.process.ProcessResultsParser$ResultIterator.hasNext(ProcessResultsParser.java:70)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.inference.pytorch.process.PyTorchResultProcessor.process(PyTorchResultProcessor.java:105)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.inference.deployment.DeploymentManager.lambda$startDeployment$2(DeploymentManager.java:180)\n\tat org.elasticsearch.server@8.17.8/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1575)\nCaused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: expected close marker for Array (start marker at [Source: (FileInputStream); line: 2, column: 1])\n at [Source: (FileInputStream); line: 3, column: 1]\n\tat 
com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:585)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:535)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:552)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd2(UTF8StreamJsonParser.java:3135)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:3105)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:716)\n\tat org.elasticsearch.xcontent.impl@8.17.8/org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:59)\n\t... 7 more\n"}

Could anyone help me?

The root cause is:
{"@timestamp":"2025-07-04T01:07:52.204Z", "log.level":"ERROR", "message":"[.multilingual-e5-small-elasticsearch] pytorch_inference/1441 process stopped unexpectedly: Fatal error: 'si_signo 4, si_code: 2, si_errno: 0, address: 0x7fe92b02dce0, library: /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/../lib/libtorch_cpu.so, base: 0x7fe923b4d000, normalized address: 0x74e0ce0', version: 9.0.3 (build d594c219a7b529)\n", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#11]","log.logger":"org.elasticsearch.xpack.ml.process.AbstractNativeProcess","elasticsearch.cluster.uuid":"jm4D-abhRqu2SVtLeGh5dw","elasticsearch.node.id":"cwODTIrpRXSfm1rjjcvyEw","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag"}
{"@timestamp":"2025-07-04T01:07:52.205Z", "log.level":"ERROR", "message":"[.multilingual-e5-small-elasticsearch] inference process crashed due to reason [[.multilingual-e5-small-elasticsearch] pytorch_inference/1441 process stopped unexpectedly: Fatal error: 'si_signo 4, si_code: 2, si_errno: 0, address: 0x7fe92b02dce0, library: /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/../lib/libtorch_cpu.so, base: 0x7fe923b4d000, normalized address: 0x74e0ce0', version: 9.0.3 (build d594c219a7b529)\n]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#11]","log.logger":"org.elasticsearch.xpack.ml.inference.deployment.DeploymentManager","elasticsearch.cluster.uuid":"jm4D-abhRqu2SVtLeGh5dw","elasticsearch.node.id":"cwODTIrpRXSfm1rjjcvyEw","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag"}

I found a related issue: Problem with libtorch_cpu library (Pytorch) error: "si_signo 4, si_code: 2, si_errno: 0" · Issue #116756 · elastic/elasticsearch · GitHub. So is it a hardware issue?

The issue in the linked GitHub thread is that the CPU didn't support the AVX (Advanced Vector Extensions) instruction set, which the library requires. So you need to check whether your specific CPU supports those instructions:

lscpu | grep avx

or

grep avx /proc/cpuinfo
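As a small sketch combining the above (Linux only, since it reads /proc/cpuinfo):

```shell
# Report whether a CPU "flags" string contains the avx flag
# (word match, so a flag like "avx2" alone does not count).
check_avx() {
    case " $1 " in
        *" avx "*) echo "AVX supported" ;;
        *)         echo "AVX NOT supported" ;;
    esac
}

# Read the first "flags" line from /proc/cpuinfo and check it.
flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
check_avx "$flags"
```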

If your CPU does support AVX but you're still getting this error, check whether it's disabled in the BIOS (unlikely, IMHO, but possible). Also, Docker containers see the host CPU directly, so Docker itself can't "hide" these instructions; a VM, however, can mask CPU flags if the hypervisor exposes a restricted CPU model to the guest.

EDIT: semi-obvious, but if you try this on a different CPU/system, it will probably work.