Best strategy for re-indexing?

Hey guys!

I have an index in Elasticsearch, and I've added new fields to its mapping (on the Spring Boot application side). The application will now be released to the client, and on the client side we will have to carry out the re-indexing (read the data, delete it from Elasticsearch, and ingest it again).

What is the current best strategy for this operation, so that no data is lost and downtime is kept to a minimum?

I'm using elasticsearch:7.16.1

I've heard that since version 2.3.4 a new _reindex API has been available, but is it the best way to go?

Cheerio.

Hi,

If your only intention is to add a new field to the existing index, then the update mapping API is the best way to do it.

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html

https://divyanij.medium.com/adding-a-new-field-in-an-existing-index-elasticsearch-a3cf25c053fb
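For reference, adding a field to an existing index is a single update mapping request. Here is a minimal sketch in Python, assuming a hypothetical index `my-index` and a hypothetical new keyword field `customer_ref` (both names are made up for illustration). It only builds the request; sending it with your HTTP client of choice is left out.

```python
import json


def put_mapping_request(index, new_fields):
    """Build a PUT /<index>/_mapping request that adds fields to an existing index."""
    return {
        "method": "PUT",
        "path": f"/{index}/_mapping",
        "body": {"properties": new_fields},
    }


# Hypothetical index and field names, for illustration only.
request = put_mapping_request("my-index", {"customer_ref": {"type": "keyword"}})
print(request["method"], request["path"])
print(json.dumps(request["body"], indent=2))
```

Documents that are already in the index are not modified by this request; they simply have no value for the new field until they are updated or ingested again.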

But if you want to delete all the data and ingest it fresh, my suggestion is to write the data to a new index name, so that the old data can be deleted once you are satisfied that everything is fine in the new index.

The reindex API copies documents from a source index to a destination index.

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

You shouldn't need to do anything when adding new fields. As @dadiasish suggested, update your existing _mapping with the new fields to ensure correctness (instead of letting ES auto-select the "type").
ES only has a problem when you change an existing field's type, say from integer to float.
We add new fields to indices all the time and the index will just handle it without any issues.

Just in case you still need/want to reindex with the new mapping to a new index:

  1. Change your code to write to the new index with the new mapping, so all new data goes to the new index. The old index will not be written to anymore.
  2. Reindex the old/existing documents into the new index with the reindex API.

That's it.
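The two steps above could be sketched as the following requests. The index names `my-index-v1` and `my-index-v2` and the mapping are placeholders, and the code only builds the requests rather than sending them. The `op_type` and `conflicts` settings are an addition on my part: since the application is already writing to the new index while the copy runs, they keep the reindex from overwriting a newer document that has the same ID.

```python
import json


def reindex_plan(old_index, new_index, mappings):
    """Return the ordered requests for moving data into an index with a new mapping."""
    return [
        # Create the new index with the new mapping; the application writes here from now on.
        {"method": "PUT", "path": f"/{new_index}", "body": {"mappings": mappings}},
        # Copy the existing documents from the old index into the new one.
        {
            "method": "POST",
            "path": "/_reindex",
            "body": {
                # Skip, rather than abort on, documents that already exist in the destination.
                "conflicts": "proceed",
                "source": {"index": old_index},
                # "create" only adds documents whose ID is not yet present in the destination.
                "dest": {"index": new_index, "op_type": "create"},
            },
        },
    ]


# Hypothetical index names and mapping, for illustration only.
plan = reindex_plan(
    "my-index-v1",
    "my-index-v2",
    {"properties": {"customer_ref": {"type": "keyword"}}},
)
for step in plan:
    print(step["method"], step["path"], json.dumps(step["body"]))
```

Once the document counts match and the new index looks right, the old index can be deleted.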

Can I also add a field to the mapping that is defined by a dynamic template? That is actually what I'm using. I also need that field to have my custom normalizer.

Not sure about dynamic templates. Never used them.
But it seems to be a way for you to define your own automatic "type definition". It's configuration.

Reindexing is only an issue when you change the type of an existing field. You don't need to reindex for a configuration change. If there's a problem with your new config, you will get an exception and the document won't be inserted into the index. It should be very obvious.

The easiest way to test is to have a small test setup (you can run it locally using docker) and test it out yourself with regard to the change you make.
Add a document, change your dynamic template, insert a new document. See if ES complains.

Or use your existing cluster and test it on a new index.
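A rough sketch of that test as a sequence of requests against a throwaway index. Everything named here is an assumption for illustration: the index `scratch-test`, the normalizer names, and the template name. Both normalizers are defined when the index is created, so the later template change doesn't need any new analysis settings. As far as I know, sending `dynamic_templates` through the update mapping API replaces the whole list of templates, so treat that step as something to verify in your own test.

```python
import json

INDEX = "scratch-test"  # hypothetical throwaway index


def keyword_template(normalizer):
    """Dynamic template that maps every new string field to a normalized keyword."""
    return {
        "strings_as_keywords": {
            "match_mapping_type": "string",
            "mapping": {"type": "keyword", "normalizer": normalizer},
        }
    }


steps = [
    # 1. Create the index with two custom normalizers and the initial dynamic template.
    ("PUT", f"/{INDEX}", {
        "settings": {"analysis": {"normalizer": {
            "my_normalizer": {"type": "custom", "filter": ["lowercase"]},
            "my_folding_normalizer": {
                "type": "custom",
                "filter": ["lowercase", "asciifolding"],
            },
        }}},
        "mappings": {"dynamic_templates": [keyword_template("my_normalizer")]},
    }),
    # 2. Add a document; "title" is mapped through the initial template.
    ("POST", f"/{INDEX}/_doc", {"title": "First Document"}),
    # 3. Change the dynamic template to use the other normalizer.
    ("PUT", f"/{INDEX}/_mapping", {
        "dynamic_templates": [keyword_template("my_folding_normalizer")],
    }),
    # 4. Insert a document with a field the index has not seen before.
    ("POST", f"/{INDEX}/_doc", {"title": "Second Document", "category": "Café"}),
    # 5. Inspect the resulting mapping to see which normalizer each field got.
    ("GET", f"/{INDEX}/_mapping", None),
]

for method, path, body in steps:
    print(method, path, "" if body is None else json.dumps(body))
```

If any of the write requests returns an error, that is ES complaining. Otherwise, the mapping from the last step shows whether the new field picked up the changed template.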

Sure, I will post another question for that, as it wasn't originally the topic of this question. Thanks for the help, guys :)
