ELSER - use the model outside of a pipeline

paulmaker · August 29, 2023, 4:16pm

Hi all,

We are exploring the new ELSER features and how to integrate them into our application. We have a two-stage ingestion approach that adds the text during the second phase. An ingest pipeline is therefore not ideal: from what I have read, it looks like we would need to reindex to get the pipeline to fire on existing documents (unless you can get a pipeline to fire when a document is updated?).

Is it possible to use the model directly, either via an API or in Python (like other Hugging Face models)?

Alternatively, I guess we could invoke the simulate API on the pipeline and then update our documents with the results. However, that feels like a bit of a hack.
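For reference, the simulate-then-update workaround described above might look roughly like the sketch below. It assumes an `elasticsearch-py` 8.x style client (`client.ingest.simulate` and `client.update`); the function name, index, and pipeline id are hypothetical, and the client is passed in so the sketch does not depend on a running cluster.

```python
def simulate_and_update(client, index, doc_id, source, pipeline_id):
    """Run `source` through an ingest pipeline via the simulate API, then
    store the fields the pipeline produced on an existing document.

    `client` is assumed to behave like an elasticsearch-py 8.x client.
    """
    resp = client.ingest.simulate(id=pipeline_id, docs=[{"_source": source}])
    result = resp["docs"][0]
    if "error" in result:
        # The simulate API reports per-document failures inline
        raise RuntimeError(f"pipeline simulation failed: {result['error']}")

    simulated = result["doc"]["_source"]
    # Keep only the fields the pipeline added or changed
    added = {k: v for k, v in simulated.items() if source.get(k) != v}
    if added:
        # Partial update of the existing document with the new fields
        client.update(index=index, id=doc_id, doc=added)
    return added
```

The simulate API is intended for testing pipelines, so this is probably best treated as a stopgap rather than a production path.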

Thanks in advance.

-P

stephenb · August 29, 2023, 4:50pm

Hi @paulmaker, welcome to the community...

Can you explain Phase 2 a little better?

My initial thought is that you could set a default pipeline for the index. It would check whether your text field exists and, if so, run the ELSER inference processor:

Phase 1: the text field does not exist, so the inference processor does not run.

Phase 2: the text field exists, so the inference processor runs.

That is just an initial thought. I think it would address your question, "unless you can get a pipeline to fire when a document is updated?", since an update is really just a soft delete followed by an index.

Give it a try and report back...
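A minimal sketch of the conditional default pipeline suggested above, written as a Python dict. The field names (`text`, `ml`), the pipeline id, and the helper name are hypothetical; the processor layout assumes the ELSER v1 inference processor options (`field_map`, `inference_config.text_expansion`), and the model id `.elser_model_1` is an assumption as well.

```python
def build_elser_pipeline(text_field="text", target_field="ml",
                         model_id=".elser_model_1"):
    """Return a pipeline body whose inference processor runs only when
    `text_field` is present (assumed to be a top-level field)."""
    return {
        "description": "Run ELSER only when the text field exists",
        "processors": [
            {
                "inference": {
                    # Painless condition: skip documents without the field
                    "if": f"ctx['{text_field}'] != null",
                    "model_id": model_id,
                    "target_field": target_field,
                    # ELSER v1 is assumed to read its input from `text_field`
                    "field_map": {text_field: "text_field"},
                    "inference_config": {
                        "text_expansion": {"results_field": "tokens"}
                    },
                }
            }
        ],
    }


# Assumed usage with an elasticsearch-py 8.x client (not executed here):
#   client.ingest.put_pipeline(id="elser-when-text", **build_elser_pipeline())
#   client.indices.put_settings(
#       index="my-index",
#       settings={"index.default_pipeline": "elser-when-text"},
#   )
```

Whether the default pipeline actually fires for the kind of update used in Phase 2 is the part to verify, as suggested above.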

Oh, I just saw that there is a direct API, so it looks like you can call the model directly...

So now you have 2 options.
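The post does not name the direct API. One candidate is the infer trained model API (`POST _ml/trained_models/<model_id>/_infer`), which `elasticsearch-py` 8.x exposes as `client.ml.infer_trained_model`. The sketch below assumes that API, the ELSER v1 model id `.elser_model_1`, the input field name `text_field`, and a response whose `predicted_value` is a token-to-weight mapping; the helper name is hypothetical.

```python
def expand_text(client, text, model_id=".elser_model_1"):
    """Call the deployed model directly and return its token weights.

    `client` is assumed to behave like an elasticsearch-py 8.x client.
    The response shape used here is an assumption; check it against
    your cluster version.
    """
    resp = client.ml.infer_trained_model(
        model_id=model_id,
        docs=[{"text_field": text}],  # one input document per text
    )
    # One result per input document, in the same order
    return resp["inference_results"][0]["predicted_value"]
```

The returned weights can then be written to the document in the second ingestion phase with a normal update, without involving a pipeline.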

paulmaker · August 30, 2023, 9:13am

Thanks for the reply, I have got this all working.

What are the plans to increase the size of the text that can be processed (the current limit is 512 tokens)? Has engineering considered chunking the text and then accumulating the feature scores (average of max)?
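To make the chunking idea concrete, here is a rough client-side sketch, not something ELSER is known to do internally. It splits on whitespace as a crude stand-in for the model's tokenizer (so `max_words` should stay well below 512), and pools per-token weights across chunks by max or by mean, since "average of max" could be read either way. All names are hypothetical; `infer` is any callable that maps a chunk of text to a token-to-weight dict.

```python
from collections import defaultdict


def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping word windows (words approximate tokens)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
        start += max_words - overlap
    return chunks


def pool_token_weights(chunk_weights, mode="max"):
    """Combine per-chunk {token: weight} dicts into one dict."""
    collected = defaultdict(list)
    for weights in chunk_weights:
        for token, weight in weights.items():
            collected[token].append(weight)
    if mode == "max":
        return {t: max(ws) for t, ws in collected.items()}
    if mode == "mean":
        # Average over all chunks; a token absent from a chunk counts as 0
        n = len(chunk_weights)
        return {t: sum(ws) / n for t, ws in collected.items()}
    raise ValueError(f"unknown mode: {mode}")


def expand_long_text(text, infer, max_words=200, overlap=20, mode="max"):
    """Chunk the text, run `infer` on each chunk, and pool the results."""
    chunks = chunk_words(text, max_words, overlap)
    return pool_token_weights([infer(c) for c in chunks], mode)
```

Max pooling keeps a token's strongest signal from any chunk, while the mean dampens tokens that only matter in one part of the document; which works better for retrieval would need testing.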

stephenb · August 30, 2023, 2:31pm

I understand engineering is continuing, and will continue, to add features and capabilities, but I cannot comment on them at this time; we don't communicate future features on this forum.

system · September 27, 2023, 2:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
