Update a field

Hi,

I have an index that contains a document for every commit in my repo.

One field in these documents is stored in inconsistent formats, so aggregating on that field gives different results for what is really the same value.

The field name is Author.

Examples of the different formats:

Format 1: saram ali<saram.ali@email.com>
Format 2: Saram Ali <saram.ali@email.com>

Both are the same author, just formatted differently in the field.
Is there any way I can extract only the email address from this field and update the field with it? I have 1.5 million documents with this field.
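
To make the goal concrete, here is a minimal Python sketch of the extraction being asked for. It is only an illustration of the rule, not something run inside Elasticsearch; the function name `extract_email` and the choice to lower-case the result are assumptions, not part of the original question.

```python
import re
from typing import Optional

# An email-like token wrapped in angle brackets at the end of the value.
_EMAIL_IN_BRACKETS = re.compile(r"<\s*([^<>\s]+@[^<>\s]+)\s*>\s*$")


def extract_email(author: str) -> Optional[str]:
    """Return the lower-cased email from 'Name <email>' or 'Name<email>', else None."""
    match = _EMAIL_IN_BRACKETS.search(author)
    return match.group(1).lower() if match else None


for raw in ("saram ali<saram.ali@email.com>", "Saram Ali <saram.ali@email.com>"):
    # Both formats from the post reduce to the same value.
    print(extract_email(raw))
```

With this rule, both example formats collapse to `saram.ali@email.com`, which is what the aggregation needs.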

Do you mean using the Update By Query API?

That could be one option. I'm also looking for an efficient solution, and I'd like to know whether this is possible at all, since there are 1.5 million records.

Those documents have to be reindexed either way. Depending on the overall size of the dataset, a full reindex into a new index might be better.

What do you suggest for the update? How should I update the field while reindexing?

"How should I update the field while reindexing?"

You probably want to apply some text transformation. You can use a Painless script during the update or reindex to apply whatever rule you need.
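
As a rough sketch of what that could look like, the Python below builds request bodies for the Update By Query API (`POST /<index>/_update_by_query`) and the Reindex API (`POST /_reindex`) with a Painless script attached. The index names `commits` and `commits_v2` are hypothetical, the script is an untested assumption about the rule wanted here (keep the text between `<` and `>`, lower-cased), and the code only prints the JSON rather than sending it.

```python
import json

# Painless source (assumed rule): keep only the text between "<" and ">" in Author.
# Documents whose Author has no "<...>" part are left untouched via ctx.op = 'noop'.
NORMALIZE_AUTHOR = """
def a = ctx._source.Author;
if (a instanceof String) {
  int s = a.indexOf('<');
  int e = a.indexOf('>');
  if (s >= 0 && e > s) {
    ctx._source.Author = a.substring(s + 1, e).trim().toLowerCase();
  } else {
    ctx.op = 'noop';
  }
} else {
  ctx.op = 'noop';
}
"""


def script_clause() -> dict:
    """Script section shared by both request bodies."""
    return {"lang": "painless", "source": NORMALIZE_AUTHOR}


def update_by_query_body() -> dict:
    """Body for POST /<index>/_update_by_query (in-place update)."""
    return {"script": script_clause()}


def reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for POST /_reindex (copy into a new index, transforming on the way)."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": script_clause(),
    }


# "commits" and "commits_v2" are placeholder index names.
print(json.dumps(update_by_query_body(), indent=2))
print(json.dumps(reindex_body("commits", "commits_v2"), indent=2))
```

Trying the script on a small test index first is a sensible precaution before running it against all 1.5 million documents.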
