Hi Team,
Before filing an issue, I was hoping to get some assistance testing inference on the ELSER v2 model, in case I am missing something in the docs.
Elasticsearch version: 8.17.2
ML Nodes: 1
Instance Type: Amazon r7gd.xlarge instance (ARM64 arch, 4 vCPUs, 32 GB RAM)
OS: Ubuntu 22.04.4 LTS
First question: do the ELSER/E5 models support ARM64-based architectures? I didn't find any documentation explicitly stating it either way, but a similar forum topic reported issues deploying these models on comparable ARM64 EC2 instances.
Second, here are the steps I ran to set up the inference endpoint and the ELSER v2 model, and the error I am getting:
- After spinning up and provisioning my node with the appropriate roles and ML settings, I started the free trial (`_license/start_trial?acknowledge=true`).
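For reference, the trial activation call was simply this (`<myhost>` is a placeholder for my cluster endpoint):

```
curl -X POST "<myhost>/_license/start_trial?acknowledge=true"
```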
- Querying `_ml/trained_models?pretty`, the only installed model I have is `lang_ident_model_1`, for identifying language.
- I created an inference endpoint using the API: `_inference/sparse_embedding/my-elser-model`.
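For context, the request followed the documented shape for the `elser` service; the `service_settings` values below are illustrative rather than an exact copy of my request:

```
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
```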
- The inference endpoint was created successfully and I can see the model imported/started in the logs, though there are a few warnings triggered by `OjAlgoUtils` around "ojAlgo includes a small set of predefined hardware profiles, none of which were deemed suitable for the hardware you're currently using.":
{"type": "server", "timestamp": "2025-03-03T13:30:01,349Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "current.health=\"GREEN\" message=\"Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.ml-inference-native-000002][0]]]).\" previous.health=\"YELLOW\" reason=\"shards started [[.ml-inference-native-000002][0]]\"", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" , "current.health":"GREEN", "message":"Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.ml-inference-native-000002][0]]]).", "previous.health":"YELLOW", "reason":"shards started [[.ml-inference-native-000002][0]]" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,637Z", "level": "INFO", "component": "o.e.x.m.p.a.TransportLoadTrainedModelPackage", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[.elser_model_2] finished model import after [5] seconds", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,948Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "ojAlgo includes a small set of predefined hardware profiles,", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,949Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "none of which were deemed suitable for the hardware you're currently using.", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,949Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "A default hardware profile, that is perfectly usable, has been set for you.", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,949Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "You may want to set org.ojalgo.OjAlgoUtils.ENVIRONMENT to something that", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,949Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "better matches the hardware/OS/JVM you're running on, than the default.", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,950Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "Additionally it would be appreciated if you contribute your hardware profile:", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,950Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "https://github.com/optimatika/ojAlgo/issues", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:05,950Z", "level": "INFO", "component": "stdout", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "Architecture=aarch64 Threads=4 Memory=16517169152", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:30:06,112Z", "level": "INFO", "component": "o.e.x.m.i.d.DeploymentManager", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[my-elser-model] Starting model deployment of model [.elser_model_2]", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
- Re-querying `_ml/trained_models?pretty`, I can see `.elser_model_2` listed now:
{
"model_id" : ".elser_model_2",
"model_type" : "pytorch",
"model_package" : {
"packaged_model_id" : "elser_model_2",
"model_repository" : "https://ml-models.elastic.co",
"minimum_version" : "11.0.0",
"size" : 438123914,
"sha256" : "2e0450a1c598221a919917cbb05d8672aed6c613c028008fedcd696462c81af0",
"metadata" : { },
"tags" : [ ],
"vocabulary_file" : "elser_model_2.vocab.json"
},
"created_by" : "api_user",
"version" : "12.0.0",
"create_time" : 1741008600166,
"model_size_bytes" : 0,
"estimated_operations" : 0,
"license_level" : "platinum",
"description" : "Elastic Learned Sparse EncodeR v2",
"tags" : [
"elastic"
],
"metadata" : { },
"input" : {
"field_names" : [
"text_field"
]
},
"inference_config" : {
"text_expansion" : {
"vocabulary" : {
"index" : ".ml-inference-native-000002"
},
"tokenization" : {
"bert" : {
"do_lower_case" : true,
"with_special_tokens" : true,
"max_sequence_length" : 512,
"truncate" : "first",
"span" : -1
}
}
}
},
"location" : {
"index" : {
"name" : ".ml-inference-native-000002"
}
}
}
- Running `_infer` against the model results in the following error, as well as the Elasticsearch service crashing and restarting:
curl -X POST "<myhost>/_ml/trained_models/.elser_model_2/_infer" -H 'Content-Type: application/json' -d'
{
"docs": [{ "text_field": "This is a test sentence" }]
}
'
{"type": "server", "timestamp": "2025-03-03T13:32:47,656Z", "level": "ERROR", "component": "o.e.x.m.p.AbstractNativeProcess", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[my-elser-model] pytorch_inference/18123 process stopped unexpectedly: ", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:47,730Z", "level": "ERROR", "component": "o.e.x.m.i.d.DeploymentManager", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[my-elser-model] inference process crashed due to reason [[my-elser-model] pytorch_inference/18123 process stopped unexpectedly: ]", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:47,731Z", "level": "INFO", "component": "o.e.x.m.i.d.DeploymentManager", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "Inference process [my-elser-model] failed due to [[my-elser-model] pytorch_inference/18123 process stopped unexpectedly: ]. This is the [1] failure in 24 hours, and the process will be restarted.", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:47,732Z", "level": "INFO", "component": "o.e.x.m.i.d.DeploymentManager", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[my-elser-model] Starting model deployment of model [.elser_model_2]", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:47,656Z", "level": "ERROR", "component": "o.e.x.m.i.p.p.PyTorchResultProcessor", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[my-elser-model] Error processing results", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" ,
"stacktrace": ["org.elasticsearch.xcontent.XContentEOFException: [3:1] Unexpected end of file",
"at org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:62) ~[?:?]",
"at org.elasticsearch.xpack.ml.process.ProcessResultsParser$ResultIterator.hasNext(ProcessResultsParser.java:70) ~[?:?]",
"at org.elasticsearch.xpack.ml.inference.pytorch.process.PyTorchResultProcessor.process(PyTorchResultProcessor.java:105) ~[?:?]",
"at org.elasticsearch.xpack.ml.inference.deployment.DeploymentManager.lambda$startDeployment$2(DeploymentManager.java:180) ~[?:?]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956) ~[elasticsearch-8.17.2.jar:?]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]",
"at java.lang.Thread.run(Thread.java:1575) ~[?:?]",
"Caused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: expected close marker for Array (start marker at [Source: (FileInputStream); line: 2, column: 1])",
" at [Source: (FileInputStream); line: 3, column: 1]",
"at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:585) ~[?:?]",
"at com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:535) ~[?:?]",
"at com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:552) ~[?:?]",
"at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd2(UTF8StreamJsonParser.java:3135) ~[?:?]",
"at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:3105) ~[?:?]",
"at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:716) ~[?:?]",
"at org.elasticsearch.xcontent.provider.json.JsonXContentParser.nextToken(JsonXContentParser.java:59) ~[?:?]",
"... 7 more"] }
{"type": "server", "timestamp": "2025-03-03T13:32:49,752Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "stopping ...", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:49,753Z", "level": "INFO", "component": "o.e.c.f.AbstractFileWatchingService", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "shutting down watcher thread", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
{"type": "server", "timestamp": "2025-03-03T13:32:49,780Z", "level": "ERROR", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "dev-purple", "node.name": "purple-master-0", "message": "[controller/17854] [CDetachedProcessSpawner.cc@193] Child process with PID 18123 was terminated by signal 9", "cluster.uuid": "tFP7DsTIR-OYrg24tRT_WA", "node.id": "dyWTELgFRsueu3XzQfrmtA" }
- It's worth calling out that the `.ml-inference-native-000002` index assigned to `inference_config.text_expansion.vocabulary.index` is empty. Is that expected?
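(A quick way to confirm the index is empty, using the standard `_count` API; `<myhost>` is again a placeholder:)

```
curl "<myhost>/.ml-inference-native-000002/_count?pretty"
```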
I can provide further logs if that would help troubleshoot the issue. I was also planning to test on a non-ARM64 architecture to see whether the problem still arises.
Thanks in advance,
Bryan W.