Should I insert into a new index or update existing documents when handling large data in Elasticsearch?

Hello,
I have the following task:

  1. Read a very large volume of logs from an agent and send them to Elasticsearch in real time (I use fluent-bit).
  2. Modify each record to add my own tags, based on specific conditions on each document (the function add_tags(doc) is written in Python). So I need to schedule a job that does the following:
  • Get all documents that are not yet tagged from Elasticsearch
  • Run them through add_tags to add a tags field to each document
  • Write them back to Elasticsearch
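
A rough sketch of such a job in Python, to make the three steps concrete. It assumes that "not yet tagged" means the document has no tags field, and that add_tags(doc) returns the document with a tags list added. The add_tags body below is only a hypothetical stand-in (the real conditions are not shown here), and build_update_actions is a helper name invented for illustration. The sketch only builds partial-update actions in the dict format accepted by the bulk helper of the official Python client; the actual calls to Elasticsearch are left as comments.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# Query DSL matching documents that have no "tags" field yet.
UNTAGGED_QUERY: Dict[str, Any] = {
    "bool": {"must_not": {"exists": {"field": "tags"}}}
}


def add_tags(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the real add_tags(doc).

    Assumption: it returns a copy of the document with a "tags" list added.
    """
    tags = ["error"] if doc.get("level") == "error" else ["normal"]
    return {**doc, "tags": tags}


def build_update_actions(
    hits: Iterable[Dict[str, Any]],
    tagger: Callable[[Dict[str, Any]], Dict[str, Any]] = add_tags,
) -> Iterator[Dict[str, Any]]:
    """Turn search hits into partial-update bulk actions.

    Only the "tags" field is sent, so the rest of each document is not
    re-transmitted by the client.
    """
    for hit in hits:
        tagged = tagger(hit["_source"])
        yield {
            "_op_type": "update",
            "_index": hit["_index"],
            "_id": hit["_id"],
            "doc": {"tags": tagged["tags"]},
        }


if __name__ == "__main__":
    # Sample hits shaped like Elasticsearch search results (made-up data).
    sample_hits = [
        {"_index": "logs-1", "_id": "a", "_source": {"level": "error"}},
        {"_index": "logs-1", "_id": "b", "_source": {"level": "info"}},
    ]
    for action in build_update_actions(sample_hits):
        print(action)

    # With the official client, the same generator could be wired up roughly as:
    #   from elasticsearch import Elasticsearch, helpers
    #   client = Elasticsearch("http://localhost:9200")  # assumed address
    #   hits = helpers.scan(client, index="logs-*",      # assumed index pattern
    #                       query={"query": UNTAGGED_QUERY})
    #   helpers.bulk(client, build_update_actions(hits))
```

One caveat with this selection query: an exists query does not match a field holding an empty array, so a document for which add_tags produces no tags would be picked up again on every run unless some marker value is stored.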

In the last step I need to write the documents back to ES. If I update the existing documents, will that perform badly?
Or should I insert them into a new index instead?

Thanks all.
