The Problem: The search data for our service changes every day. Because of this, I have to reindex the newly pushed data to generate the tokens for semantic search. While this reindexing is running with the ELSER model, searches on existing data also slow down, because the model is busy with the reindexing work. How can this be avoided?
I see that you also asked this question in the community Slack. In the future, please ask in only one place, so that we can avoid duplicate efforts to triage and answer your questions.
This is expected, especially when both queries and reindexing use the same model deployment and are therefore "fighting" for the same resources. Options to address this include:
Consider adding more ML nodes to increase overall capacity. On Elastic Cloud, ML nodes support autoscaling, which can help by adding capacity while reindexing is running.
Adding one more option from my side:
Another option (I believe) is to deploy ELSER twice. You can use one deployment for ingest and another for search. This ensures that your search queue is not flooded with inference requests during periods of heavy ingest, or vice versa.
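A sketch of what that could look like, assuming Elasticsearch 8.8+ (which supports the `deployment_id` parameter on `_start`) and 8.11+ for the `input_output` processor syntax; the model name `.elser_model_2`, the deployment IDs, the pipeline name, and the field names are placeholders:

```
# Start two deployments of the same ELSER model
POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=elser_ingest
POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=elser_search

# Point the inference processor at the ingest deployment
PUT _ingest/pipeline/elser-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "elser_ingest",
        "input_output": [
          { "input_field": "content", "output_field": "content_embedding" }
        ]
      }
    }
  ]
}

# Point the text_expansion query at the search deployment
GET my-index/_search
{
  "query": {
    "text_expansion": {
      "content_embedding": {
        "model_id": "elser_search",
        "model_text": "example query"
      }
    }
  }
}
```

The key detail is that a deployment ID is accepted anywhere a `model_id` is expected, so ingest and search traffic land on separate deployments (and separate allocation queues) even though both run the same trained model.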