Plugin to call an external REST endpoint for each result document with extra parameters from the original request

Hi,
I was going through the online documentation to see whether there is any way to solve my use case while using ES for document search.
My use case is as follows:

  1. Loaded a bunch of documents into ES, let's say 10k.

  2. Clients can search these documents with filters via the standard API/SDK/POST, while also passing some custom information as a separate JSON object.
    custom Request {
      // ES-specific search,
      "extraParams": {
        "firstParam": "firstValue",
        "secondParam": "secondValue"
      }
    }

  3. While processing the search, each ES shard node should call a REST endpoint, passing "extraParams" and some fields from each matching document. The REST result should be stored on that document and returned with it in the response. If the client requests it, the results should also be sortable on this newly calculated field.

  4. Pagination should also be supported while sorting on this newly calculated field.

  5. It would be great if we could batch multiple documents into a single POST call to the REST endpoint.

Note: I looked at the options and found the Ingest API, which does some of this by adding a new field to a document before indexing. My case is the reverse: I want to calculate a new field on the filtered results and return this new field to the requester as well.

Thanks
Vipin
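
For reference, steps 2 to 4 above can be sketched outside Elasticsearch, at the orchestration layer. This is only a minimal sketch under stated assumptions, not a plugin: `split_request`, `enrich_sort_page`, the `computed` field name, and the `score_batch` callable (a stand-in for the REST endpoint) are all hypothetical names. It also shows why sorting on the new field is expensive: the value has to be computed for every matching document before even the first page can be returned.

```python
from typing import Any, Callable, Dict, List, Tuple

Doc = Dict[str, Any]


def split_request(custom_request: Doc) -> Tuple[Doc, Doc]:
    """Separate the ES search body from the (hypothetical) extraParams object."""
    es_body = {k: v for k, v in custom_request.items() if k != "extraParams"}
    return es_body, dict(custom_request.get("extraParams", {}))


def enrich_sort_page(
    hits: List[Doc],
    extra_params: Doc,
    score_batch: Callable[[List[Doc], Doc], List[Any]],
    page: int = 0,
    page_size: int = 10,
    field: str = "computed",  # hypothetical name for the calculated field
    descending: bool = True,
) -> List[Doc]:
    """Attach the calculated field to every hit, sort on it, return one page."""
    # score_batch stands in for the REST endpoint: one value per document.
    values = score_batch(hits, extra_params)
    if len(values) != len(hits):
        raise ValueError("score_batch must return one value per document")
    enriched = [{**hit, field: value} for hit, value in zip(hits, values)]
    # The sort needs the calculated value of ALL hits, not just one page.
    enriched.sort(key=lambda doc: doc[field], reverse=descending)
    start = page * page_size
    return enriched[start:start + page_size]
```

Here `hits` would be the `_source` documents returned by the search, and `score_batch` would wrap the actual HTTP call.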

As far as I know, there is no built-in way to call out during a search. It does, however, sound like this would be a very slow and expensive way to search. If you can describe at a high level what you are trying to achieve, someone might be able to come up with more efficient alternatives.

Hi,
Thanks for the response.

At a high level, we are trying to do some processing on the filtered documents (ES search results) together with some other static data (supplied by the client and applied uniformly to all search results) through a REST call.

The other alternative is to fetch all the filtered data and apply this logic at the orchestration layer, but then we need to fetch all results from ES (approximately 500K) in under one second to meet our SLA. That amounts to deep pagination using search or scan/scroll. To fetch this much data in under one second, the only option is to fetch all the result pages in parallel, which seems to be a problem with ES because:

  1. Search has a default limit of 10K results; even if we raise it, deep pages will be slow because of the large offset.
  2. Scan/scroll gives you a cursor for the next fetch, which means it is built for sequential fetching, and we cannot move backwards.

Any help or guidance would really help us move forward.

Thanks
Vipin
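
On the point that scroll is sequential: Elasticsearch also has a sliced scroll, where each initial scroll request carries a `slice` object with an `id` and a `max`, so several independent cursors can be consumed in parallel. The sketch below only builds the request bodies; `build_sliced_scroll_bodies` is a hypothetical helper, the page size is an arbitrary assumption, and nothing here claims that 500K documents can be fetched in under a second this way.

```python
from typing import Any, Dict, List


def build_sliced_scroll_bodies(
    query: Dict[str, Any],
    num_slices: int,
    page_size: int = 1000,  # assumed page size, tune for your documents
) -> List[Dict[str, Any]]:
    """Build one initial scroll request body per slice.

    Each body would be sent as its own scroll search, and each returned
    scroll cursor can then be drained by a separate worker.
    """
    if num_slices < 2:
        raise ValueError("sliced scroll needs at least 2 slices")
    return [
        {
            "slice": {"id": slice_id, "max": num_slices},
            "size": page_size,
            "query": query,
        }
        for slice_id in range(num_slices)
    ]
```

Every slice sees a disjoint part of the result set, so the union of all slices is the full set of matches. Each individual cursor is still forward-only.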

Performing individual call-outs for several thousand records from within Elasticsearch would probably not be much faster. I still do not understand what you are looking to do based on your description. A more concrete example would be helpful.

Hi,
Our case is that we need to do some further processing on the filtered results from ES, and the whole processing should be completed in, say, 3-4 secs.
So we have two options:

  1. Either fetch all the filtered results from ES within 12 secs and process the retrieved data in parallel to achieve the required performance.
  2. Or ask ES to post-process the resulting data, as it does with the Ingest API for incoming data.

Thanks
Vipin

What type of post processing do you perform? What is the additional data you are referencing?

If you cannot provide a good description of the type of processing you are performing, it is hard for anyone to come up with alternative approaches.

Post-processing is:

  • Use partial data from the document.
  • Use some data from the original request.
  • Call a REST endpoint that takes the two data sets above and returns a response. The logic behind the REST call is implemented to support batch processing of requests within milliseconds.

If that helps.
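
The three bullets above can be sketched as follows. This is a rough sketch with assumptions: the payload shape (`extraParams` plus a `documents` list), the helper names `build_batches` and `process_batches`, and the `send` callable standing in for the HTTP POST to the REST endpoint are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Sequence

Doc = Dict[str, Any]


def build_batches(
    hits: List[Doc],
    fields: Sequence[str],
    extra_params: Doc,
    batch_size: int,
) -> List[Doc]:
    """Group hits into batch payloads with only the needed document fields."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    for start in range(0, len(hits), batch_size):
        chunk = hits[start:start + batch_size]
        # Partial data from each document, plus data from the original request.
        docs = [{name: hit.get(name) for name in fields} for hit in chunk]
        batches.append({"extraParams": extra_params, "documents": docs})
    return batches


def process_batches(
    batches: List[Doc],
    send: Callable[[Doc], List[Any]],  # stand-in for the REST call
    workers: int = 4,
) -> List[Any]:
    """Send batches concurrently and flatten the results in document order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_batch = list(pool.map(send, batches))  # map preserves batch order
    return [value for result in per_batch for value in result]
```

Because `pool.map` returns results in input order, the flattened list lines up with the original hits, so each value can be attached back to its document.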

OK. I see no easy built-in way to do that, so I suspect it may require developing a custom Elasticsearch plugin, but others may have a better idea.

Is there any plugin that transforms the output data from Elasticsearch which we could use as a reference? I know it will be slow, but we can explore this option.

Hi,
How can we make sure:

  • That our search returns the latest indexed documents while a load is in progress, without sorting by updated dtm? Is there any built-in way to do so?
  • That some documents are deleted after a time-to-live expiration?

Thanks
Vipin
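
For the time-to-live question, one possible approach is to run a periodic delete-by-query with a range filter on a timestamp field, using Elasticsearch date math such as `now-30d`. The sketch below only builds the request body; the helper name `build_ttl_delete_body` and the field name `updated_dtm` are assumptions, and the body would have to be sent to the index's delete-by-query endpoint by a scheduled job of your own.

```python
import re
from typing import Any, Dict

# Date math units accepted here: y, M, w, d, h, H, m, s (for example "30d").
_TTL_PATTERN = re.compile(r"^\d+[yMwdhHms]$")


def build_ttl_delete_body(timestamp_field: str, ttl: str) -> Dict[str, Any]:
    """Build a delete-by-query body matching documents older than the TTL."""
    if not _TTL_PATTERN.match(ttl):
        raise ValueError(f"unsupported ttl value: {ttl!r}")
    return {"query": {"range": {timestamp_field: {"lt": f"now-{ttl}"}}}}


# Example: documents whose (assumed) updated_dtm field is older than 30 days.
body = build_ttl_delete_body("updated_dtm", "30d")
```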
