Hi, I am trying to deploy the built-in ELSER model, and all of the configuration seems to be fine. But the deployment starts and then takes forever to complete.

In my case I started the deployment at 11:27 AM EST on the 30th of August, but as of 11:30 AM EST on the 31st of August it still seemed to be running.
I also tried increasing the RAM and storage for Elasticsearch and for the ML node, but nothing worked.
The same thing happened with a key-phrase-extraction Hugging Face model.
Can someone please advise me on this?

Hi,

Can you please list the steps you have tried and describe which part is not working? A few screenshots would be helpful too.

Thanks.

Hi Wang,

Thanks for reaching out. I'm unable to post any screenshots, so to describe my issue: I downloaded the built-in ELSER model
from the Machine Learning -> Trained Models section, and once it finished downloading, I tried deploying it via the "Start deployment" option, setting "Number of allocations" to 1 and "Threads per allocation" to 1.
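(For reference, the equivalent request from the Dev Tools console would be something like the sketch below; the model ID here assumes ELSER v1 and may differ on other stack versions.)

POST _ml/trained_models/.elser_model_1/deployment/_start?number_of_allocations=1&threads_per_allocation=1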

It kept running for one complete day, i.e. 24 hours (from 11:27 AM EST on the 30th of August until 11:30 AM EST on the 31st of August), so I manually stopped the deployment because it was taking so long.
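(For reference, a stuck deployment can also be stopped from the Dev Tools console with something like the sketch below; force=true is only needed if a normal stop fails, and the model ID again assumes ELSER v1.)

POST _ml/trained_models/.elser_model_1/deployment/_stop?force=true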
The configuration used while deploying the ELSER model was:
Elasticsearch: 90 GB storage | 2 GB RAM | up to 2.5 vCPU, with availability zones set to 3
Machine Learning node: 4 GB RAM | 2 vCPU up to 8 vCPU, with availability zones set to 3

Please let me know if you need any further information, and if there is a resolution, please share it.

Thanks,
Manasa.

Hi Manasa,

Did you see the message below after you started the deployment? And what state is shown on your screen?

You can also check the ML node state from the Machine Learning -> Memory Usage page, or by sending the request GET _ml/trained_models/.elser_model_1/_stats through the Dev Tools console.
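For example (a sketch; the comments point at the fields that usually matter here, assuming the ELSER v1 model ID):

GET _ml/trained_models/.elser_model_1/_stats
# In the response, check deployment_stats.state ("starting" vs. "started"),
# deployment_stats.allocation_status.state (e.g. "fully_allocated"),
# and any per-node routing_state/reason entries under deployment_stats.nodes.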

Thanks.

Yes! I did see the message saying "Deployment has started" for the respective ELSER model.
The stats were also showing it running with 0 errors.
I'm not sure why it is taking so long, because there are no errors coming up while the deployment is in progress.

Silly question... what exactly is taking a long time? Are you running an inference pipeline?

Once the ELSER model is deployed and started (as shown above), it does not do anything until you use an inference pipeline to do the text expansion.

Have you created a pipeline and then posted documents through it?
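For example, a minimal ingest pipeline for ELSER v1 looks roughly like this (a sketch; the pipeline name elser-v1-test and the source field text are placeholders, and the config follows the ELSER v1 text_expansion shape):

PUT _ingest/pipeline/elser-v1-test
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml",
        "field_map": { "text": "text_field" },
        "inference_config": { "text_expansion": { "results_field": "tokens" } }
      }
    }
  ]
}

You would then index documents through it, e.g. POST my-index/_doc?pipeline=elser-v1-test with a "text" field in the body.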

If you are interested here is an end-to-end Notebook that you can try.

P.S. I am bvader

The deployment itself never completes. It just keeps running forever.

And the above issue still continues.
But just to check whether the same thing happens on another cluster, I created a new cluster and tried deploying the ELSER model with this configuration:
"Number of allocations" set to 1 and "Threads per allocation" set to 1
Elasticsearch: 90 GB storage | 2 GB RAM | up to 2.5 vCPU, with availability zones set to 3
Machine Learning node: 4 GB RAM | 2 vCPU up to 8 vCPU, with availability zones set to 3

On the new cluster, the ELSER model is again failing with the error below:

{ "statusCode": 429, "error": "Too Many Requests", "message": "[circuit_breaking_exception\n\tRoot causes:\n\t\tcircuit_breaking_exception: [parent] Data too large, data for [<http_request>] would be [1052402776/1003.6mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1052402776/1003.6mb], new bytes reserved: [0/0b], usages [model_inference=518432/506.2kb, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=0/0b]]: [parent] Data too large, data for [<http_request>] would be [1052402776/1003.6mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1052402776/1003.6mb], new bytes reserved: [0/0b], usages [model_inference=518432/506.2kb, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=0/0b]", "attributes": { "body": { "error": { "root_cause": [ { "type": "circuit_breaking_exception", "reason": "[parent] Data too large, data for [<http_request>] would be [1052402776/1003.6mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1052402776/1003.6mb], new bytes reserved: [0/0b], usages [model_inference=518432/506.2kb, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=0/0b]", "bytes_wanted": 1052402776, "bytes_limit": 1020054732, "durability": "TRANSIENT" } ], "type": "circuit_breaking_exception", "reason": "[parent] Data too large, data for [<http_request>] would be [1052402776/1003.6mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1052402776/1003.6mb], new bytes reserved: [0/0b], usages [model_inference=518432/506.2kb, eql_sequence=0/0b, fielddata=0/0b, request=0/0b, inflight_requests=0/0b]", "bytes_wanted": 1052402776, "bytes_limit": 1020054732, "durability": "TRANSIENT" }, "status": 429 } } }

Please help me with this.

Hi @Manasa4

Can you please provide a screenshot of that Trained Models screen?

What command did you run that resulted in the error above?

Is it when you tried to load a document through in the inference pipeline?

Did you try to load an extremely large document?

That error basically says the request exceeded the JVM heap.
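If it helps, you can see which circuit breaker is tripping via the standard node stats API:

GET _nodes/stats/breaker
# Compare breakers.parent.estimated_size_in_bytes with
# breakers.parent.limit_size_in_bytes, and check the tripped counter.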

Is there any other activity going on on that cluster? Is it busy with many other tasks?

I would try making the data nodes 4 GB.

You could also try setting the allocations and threads both to 2 when you start the model.
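(From Dev Tools that would be something like the following sketch, again assuming the ELSER v1 model ID.)

POST _ml/trained_models/.elser_model_1/deployment/_start?number_of_allocations=2&threads_per_allocation=2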

@Manasa4 thanks for reporting the problem. I've opened an issue in the Elasticsearch repository to track it.

We are working on a fix; in the meantime I would do as @stephenb suggested and increase the data node size to 4 GB.

