ML Inference speeds

Hello,

I have a question regarding the speed at which I embed my documents.

I currently have an index with 10,000 documents.
My ML node looks like this:


As can be seen, I currently have 4 GB of RAM and 2 vCPUs.

With this setup, I embedded the 10,000 documents in 6.5 hours.

My question is: what if I increase the RAM to 16 GB, which would also increase the vCPUs to 8?
How much faster would it be?
Is it possible to calculate this?
Is it possible to use the same calculation for an even higher upgrade?

Kr, Chenko

Hi!
There are a few variables you can adjust to influence speed and performance.
A general rule of thumb is to first scale your ML node vertically and make sure it has enough RAM for your task (or set up autoscaling).

You can then look at the number of threads and allocations for your model deployment and how changing those influences performance within your available infrastructure. Try this with a small dataset and monitor your CPU usage and processing speed until you find the best settings for your use case; see the sketch below.
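For example, here is a minimal sketch of starting a trained model deployment with explicit allocation and thread settings via the official Python client (elasticsearch-py). The connection details and `my-embedding-model` ID are placeholders, and exact parameter names can differ between client versions, so check the docs for your release:

```python
from elasticsearch import Elasticsearch

# Connect to the cluster (URL and API key are placeholders).
es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

MODEL_ID = "my-embedding-model"  # hypothetical model ID

# Start the deployment with explicit scaling settings:
# - number_of_allocations: independent copies of the model (scales throughput)
# - threads_per_allocation: CPU threads each copy uses (speeds up a single inference)
es.ml.start_trained_model_deployment(
    model_id=MODEL_ID,
    number_of_allocations=2,
    threads_per_allocation=1,
    wait_for="started",
)

# Later, the number of allocations can be changed without restarting the deployment:
es.ml.update_trained_model_deployment(
    model_id=MODEL_ID,
    number_of_allocations=4,
)
```

Keeping `number_of_allocations * threads_per_allocation` within the vCPUs actually available on the ML node is the usual constraint to experiment against.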

This blog has a small example of how allocation strategies can influence the inference time for a model (and what commands you can use to observe this).
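As a rough illustration of that kind of observation, here is a sketch of pulling deployment stats with the Python client. The stats field names (`inference_count`, `average_inference_time_ms`) are what I would expect in the response, but verify them against the output of your own cluster version:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders
MODEL_ID = "my-embedding-model"  # hypothetical model ID

# Fetch stats for the deployed model to see how each allocation is performing.
stats = es.ml.get_trained_models_stats(model_id=MODEL_ID)

for model in stats["trained_model_stats"]:
    deployment = model.get("deployment_stats", {})
    print("allocation status:", deployment.get("allocation_status"))
    for node in deployment.get("nodes", []):
        print(
            "inferences:", node.get("inference_count"),
            "avg inference ms:", node.get("average_inference_time_ms"),
        )
```

Watching the average inference time while you change threads and allocations (and comparing it against CPU usage on the ML node) is a practical way to find the sweet spot before running the full 10,000-document job.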

Hope this helps as a starting point!
