Is there any way to deliver information from ElasticSearch "in real time"?

What I am doing:
Parsing 50-100 GB of data per day from an API. Most of it is duplicated, so less than 1/1000, or even 1/10000, of it actually gets stored.

Does ElasticSearch have a built-in algorithm that can deliver information in real time to clients that have a specific query? If not, how would you do that?

PS: I'm currently in the planning stage of my development environment. I plan to use Node.js + MongoDB (for static data) + ElasticSearch, but I might change that if there is a better way to implement this feature.

It looks to me like this is what the percolator feature is built for, but you will have to write some code to get the exact feature you want, as it does not stream data to users out of the box.

It's "just" a system which compares a document to previously registered queries.
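To make that concrete, here is a rough sketch of the request bodies involved, written as TypeScript objects since you mention Node.js. The index layout and the field names (`query`, `client_id`, `message`) are my own assumptions for illustration, not something from your setup. The idea is that a field mapped with type `percolator` stores a query, and a `percolate` search asks which stored queries match a given document.

```typescript
// Hypothetical field names: "query", "client_id", "message".

// Index mapping: the "query" field has type "percolator", so documents in
// this index hold stored queries instead of ordinary data. Fields used by
// the stored queries ("message" here) must be mapped in the same index.
const percolatorMapping = {
  mappings: {
    properties: {
      query: { type: "percolator" },
      client_id: { type: "keyword" },
      message: { type: "text" },
    },
  },
};

// Body of one stored query document: the query itself plus routing metadata.
function registeredQuery(clientId: string, text: string) {
  return {
    client_id: clientId,
    query: { match: { message: text } },
  };
}

// Search body that asks: which stored queries match this document?
function percolateRequest(document: Record<string, unknown>) {
  return {
    query: {
      percolate: { field: "query", document },
    },
  };
}

console.log(JSON.stringify(percolateRequest({ message: "new bonsai tree" })));
```

The hits of that search are the stored query documents, so each hit would carry the `client_id` you saved alongside the query.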

I do understand that I will have to write 100% of the code, and I'm not scared of that. The question is more "How do I do that in the most efficient way?". I have only one idea: send the data to the clients over WebSocket as I parse it, but that does not seem very good and looks very resource-intensive. I should say that I'm a student and I don't have experience building this kind of architecture. I'm scared of creating a pile of poop (sorry). Can you help me with the architecture a little bit?

I'd do:

  • Create percolators: queries stored with other metadata, such as the client id.
  • Pass each document to the percolate query.
  • Get back the list of matching queries, and with them the client ids.
  • Stream the document through the right WebSocket to the right client.
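The steps above can be sketched roughly like this. It is only an in-memory illustration: the `percolate` function here is a stand-in I made up (a plain keyword check) for the real percolate search against ElasticSearch, and `Registration`, `Sender` and `dispatch` are hypothetical names. `Sender` is anything with a `send` method, which is the shape a WebSocket connection has.

```typescript
// A registered query plus the metadata needed for routing (hypothetical shape).
interface Registration {
  clientId: string;
  keyword: string; // stand-in for a real stored query
}

interface Doc {
  message: string;
}

// Anything that can push a string to a client, e.g. a WebSocket connection.
interface Sender {
  send(data: string): void;
}

// Stand-in for the percolate search: returns the registrations whose
// keyword occurs in the document. A real version would query ElasticSearch.
function percolate(doc: Doc, registrations: Registration[]): Registration[] {
  const text = doc.message.toLowerCase();
  return registrations.filter((r) => text.includes(r.keyword.toLowerCase()));
}

// Sends the document once to every connected client with a matching query
// and returns the ids of the clients it was delivered to.
function dispatch(
  doc: Doc,
  registrations: Registration[],
  sockets: Map<string, Sender>,
): string[] {
  const delivered = new Set<string>();
  for (const match of percolate(doc, registrations)) {
    if (delivered.has(match.clientId)) continue; // already sent to this client
    const socket = sockets.get(match.clientId);
    if (socket === undefined) continue; // client is not connected
    socket.send(JSON.stringify(doc));
    delivered.add(match.clientId);
  }
  return [...delivered];
}
```

With this shape, only documents that match somebody's query go over a socket, so the cost depends on the number of matches rather than on the raw volume you parse.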

I'll be happy to see what you come up with if you end up writing something like this.
