ELSER v2 model (version 12.0.0) inference fails on Elasticsearch 9.0.3

I installed Elasticsearch 9.0.3 with a Docker Compose file, running on CentOS 7 on an x86_64 platform. When I call the inference API with:

POST _inference/.elser-2-elasticsearch
{
  "input": "What is Elastic?"
}
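For reference, the same request can be sent from the shell with curl. The endpoint, user, and password below are placeholders — adjust them to match your docker-compose setup:

```shell
# Hypothetical endpoint and credentials -- adjust to your docker-compose setup.
ES_URL="${ES_URL:-https://localhost:9200}"
BODY='{"input": "What is Elastic?"}'

# -k skips TLS verification for the self-signed docker-compose certificate;
# prefer --cacert <path-to-ca.crt> in real use.
curl -s -k -u elastic:changeme \
     -H 'Content-Type: application/json' \
     -X POST "$ES_URL/_inference/.elser-2-elasticsearch" \
     -d "$BODY" \
  || echo "request failed (is Elasticsearch reachable at $ES_URL?)"
```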

the Elasticsearch backend always returns this error:

{"@timestamp":"2025-07-03T14:57:14.725Z", "log.level":"ERROR", "message":"[.elser-2-elasticsearch] Error processing results", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#14]","log.logger":"org.elasticsearch.xpack.ml.inference.pytorch.process.PyTorchResultProcessor","elasticsearch.cluster.uuid":"TjvQKf_RR5CQIzbnxYbAZw","elasticsearch.node.id":"Ma0mBGshSECiqSFURRkd9A","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag","error.type":"org.elasticsearch.xcontent.XContentEOFException","error.message":"[3:1] Unexpected end of file","error.stack_trace":"org.elasticsearch.xcontent.XContentEOFException: [3:1] Unexpected end of file\n\tat org.elasticsearch.xcontent.impl@8.17.8/org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:62)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.process.ProcessResultsParser$ResultIterator.hasNext(ProcessResultsParser.java:70)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.inference.pytorch.process.PyTorchResultProcessor.process(PyTorchResultProcessor.java:105)\n\tat org.elasticsearch.ml@8.17.8/org.elasticsearch.xpack.ml.inference.deployment.DeploymentManager.lambda$startDeployment$2(DeploymentManager.java:180)\n\tat org.elasticsearch.server@8.17.8/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1575)\nCaused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: expected close marker for Array (start marker at [Source: (FileInputStream); line: 2, column: 1])\n at [Source: (FileInputStream); line: 3, column: 1]\n\tat 
com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:585)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:535)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:552)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd2(UTF8StreamJsonParser.java:3135)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:3105)\n\tat com.fasterxml.jackson.core@2.17.2/com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:716)\n\tat org.elasticsearch.xcontent.impl@8.17.8/org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:59)\n\t... 7 more\n"}

Could anyone help me?

The root cause is:
{"@timestamp":"2025-07-04T01:07:52.204Z", "log.level":"ERROR", "message":"[.multilingual-e5-small-elasticsearch] pytorch_inference/1441 process stopped unexpectedly: Fatal error: 'si_signo 4, si_code: 2, si_errno: 0, address: 0x7fe92b02dce0, library: /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/../lib/libtorch_cpu.so, base: 0x7fe923b4d000, normalized address: 0x74e0ce0', version: 9.0.3 (build d594c219a7b529)\n", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#11]","log.logger":"org.elasticsearch.xpack.ml.process.AbstractNativeProcess","elasticsearch.cluster.uuid":"jm4D-abhRqu2SVtLeGh5dw","elasticsearch.node.id":"cwODTIrpRXSfm1rjjcvyEw","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag"}
{"@timestamp":"2025-07-04T01:07:52.205Z", "log.level":"ERROR", "message":"[.multilingual-e5-small-elasticsearch] inference process crashed due to reason [[.multilingual-e5-small-elasticsearch] pytorch_inference/1441 process stopped unexpectedly: Fatal error: 'si_signo 4, si_code: 2, si_errno: 0, address: 0x7fe92b02dce0, library: /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/../lib/libtorch_cpu.so, base: 0x7fe923b4d000, normalized address: 0x74e0ce0', version: 9.0.3 (build d594c219a7b529)\n]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[es01][ml_native_inference_comms][T#11]","log.logger":"org.elasticsearch.xpack.ml.inference.deployment.DeploymentManager","elasticsearch.cluster.uuid":"jm4D-abhRqu2SVtLeGh5dw","elasticsearch.node.id":"cwODTIrpRXSfm1rjjcvyEw","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"rag"}

I found a related issue: Problem with libtorch_cpu library (Pytorch) error: "si_signo 4, si_code: 2, si_errno: 0" · Issue #116756 · elastic/elasticsearch · GitHub. So is it a hardware issue?

The issue in the linked GitHub thread is that the CPU didn't support the AVX (Advanced Vector Extensions) instruction set, which the library requires. So you need to check whether your specific CPU supports those instructions:

lscpu | grep avx

or

grep avx /proc/cpuinfo
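As a small sketch combining the above (Linux only, since it reads /proc/cpuinfo):

```shell
# Report whether a CPU "flags" string contains the avx flag
# (word match, so a flag like "avx2" alone does not count).
check_avx() {
    case " $1 " in
        *" avx "*) echo "AVX supported" ;;
        *)         echo "AVX NOT supported" ;;
    esac
}

# Read the first "flags" line from /proc/cpuinfo and check it.
flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
check_avx "$flags"
```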

If your CPU does support AVX but you're still getting this error, check whether it's disabled in the BIOS (unlikely, IMHO, but possible). Also, Docker containers see the host CPU directly, so Docker itself can't "hide" these instructions; a VM, however, can mask CPU flags if the hypervisor exposes a restricted CPU model to the guest.

EDIT: semi-obvious, but if you try this on a different CPU/system, it will probably work.