Text Summarization as a processor in Elastic

Pavit_Kaur1 · December 2, 2024, 7:23am

Elasticsearch version: 8.11.0

Can anyone guide me on how to use sshleifer/distilbart-cnn-12-6 to summarize a long text field and save the summarized output in documents? Specifically, I want to use it as a processor within an Elasticsearch pipeline.

Carlos_D · December 2, 2024, 8:50am

Hi @Pavit_Kaur1,

As you saw in this enhancement request, summarization is not yet available in Elasticsearch itself.

You will need to perform the summarization in an external process and then ingest the documents, with the summaries included, into Elasticsearch.
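
A rough sketch of what that external step could look like in Python, assuming the Hugging Face `transformers` package and the official `elasticsearch` client are installed. The index name `articles`, the fields `body` and `summary`, the cluster URL, and the length limits are all placeholders I made up for illustration, not anything Elasticsearch prescribes:

```python
from typing import Callable, Dict, Iterable, Iterator


def add_summaries(
    docs: Iterable[Dict],
    summarize: Callable[[str], str],
    index: str = "articles",        # hypothetical index name
    source_field: str = "body",     # hypothetical long-text field
    target_field: str = "summary",  # hypothetical output field
) -> Iterator[Dict]:
    """Yield bulk-index actions whose _source carries a summary field."""
    for doc in docs:
        enriched = dict(doc)  # copy so the caller's document is not mutated
        text = doc.get(source_field, "")
        if text.strip():
            # Only summarize documents that actually have text.
            enriched[target_field] = summarize(text)
        yield {"_index": index, "_source": enriched}


def run_once(docs: Iterable[Dict]) -> None:
    """Summarize docs with distilbart and bulk-index them (needs network access)."""
    # Imported here so add_summaries can be used without these packages.
    from elasticsearch import Elasticsearch, helpers
    from transformers import pipeline

    summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

    def summarize(text: str) -> str:
        # truncation=True cuts input that exceeds the model's maximum length;
        # the max_length/min_length values are illustrative, tune them for your data.
        result = summarizer(
            text, max_length=130, min_length=30, do_sample=False, truncation=True
        )
        return result[0]["summary_text"]

    es = Elasticsearch("http://localhost:9200")  # assumed local, unsecured cluster
    helpers.bulk(es, add_summaries(docs, summarize))
```

Calling `run_once(my_docs)` with a list of dicts would download the model on first use and index one enriched document per input. Very long fields get truncated to the model's input limit here, so you may want to chunk the text yourself before summarizing.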

Pavit_Kaur1 · December 2, 2024, 10:31am

Thanks for the reply @Carlos_D. To confirm that I am not missing something, is there any other approach to perform summarization for a long text field directly within Elasticsearch using its machine learning features?

Carlos_D · December 2, 2024, 2:07pm

Not at the moment, @Pavit_Kaur1. Elasticsearch provides an ingest processor for text embedding and sparse embedding, and semantic_text generates embeddings automatically, but there is no ingest processor or other mechanism in Elasticsearch that performs summarization directly.

I'm afraid client-side processing is necessary to generate summaries for the time being.
