Text Summarization as a processor in Elastic

Elasticsearch version: 8.11.0

Can anyone guide me on how to use sshleifer/distilbart-cnn-12-6 to summarize a long text field and save the summarized output in documents? Specifically, I want to use it as a processor within an Elasticsearch pipeline.

Hi @Pavit_Kaur1 :

As you saw in this enhancement request, summarization is not yet available within Elasticsearch itself.

You will need to perform the summarization in an external process and then ingest the documents, with the summaries included, into Elasticsearch.
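A minimal sketch of that external step, assuming the Hugging Face `transformers` library and the official `elasticsearch` Python client are installed. The index name `articles`, the field names `body` and `summary`, the cluster URL, and the length limits are placeholders chosen for illustration, not anything Elasticsearch prescribes.

```python
from typing import Callable, Dict, Iterable, Iterator


def build_actions(
    docs: Iterable[Dict],
    summarize: Callable[[str], str],
    index: str,
    source_field: str = "body",      # hypothetical name of the long text field
    target_field: str = "summary",   # hypothetical name of the output field
) -> Iterator[Dict]:
    """Yield bulk actions whose _source carries the summary next to the text."""
    for doc in docs:
        enriched = dict(doc)  # shallow copy so the caller's dict is not modified
        text = enriched.get(source_field) or ""
        if text.strip():
            enriched[target_field] = summarize(text)
        doc_id = enriched.pop("_id", None)
        action = {"_index": index, "_source": enriched}
        if doc_id is not None:
            action["_id"] = doc_id
        yield action


def main() -> None:
    # Imported here so build_actions can be reused without these packages.
    from elasticsearch import Elasticsearch, helpers
    from transformers import pipeline

    summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

    def summarize(text: str) -> str:
        # truncation=True clips inputs that exceed the model's maximum length;
        # very long fields may need to be split into chunks first.
        result = summarizer(
            text, max_length=130, min_length=30, do_sample=False, truncation=True
        )
        return result[0]["summary_text"]

    es = Elasticsearch("http://localhost:9200")  # assumed local, unsecured cluster
    docs = [{"_id": "1", "body": "Replace this with the long text to summarize."}]
    helpers.bulk(es, build_actions(docs, summarize, index="articles"))
```

Calling `main()` runs the model locally and indexes the enriched documents in one bulk request. Keeping `build_actions` independent of the model means the summarizer can be swapped for any function that maps text to text.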

Thanks for the reply @Carlos_D. To confirm that I am not missing something, is there any other approach to perform summarization for a long text field directly within Elasticsearch using its machine learning features?

Not at the moment, @Pavit_Kaur1. Elasticsearch provides an ingest processor for text embedding / sparse embedding, and semantic_text provides automatic embedding generation, but there is no ingest processor or other mechanism in Elasticsearch that performs summarization directly.

I'm afraid client-side processing is necessary to generate summaries for the time being.
