Reindex data with a new Field

Hi,

Is there any way to reindex the data and create a new field from data already in the documents? For example, reindex an entire index and create a new numeric field, Hour, from a date field (YYYY-mm-dd, HH-mm-ss). I know that Lucene has an expression for this (doc['@timestamp'].date.getHourOfDay()), but can it be used to create a new field when you reindex?

I need this new field for a search query, and scripted fields can't be used for that. The alternative is a script in the query, but that could hurt search time as the number of documents searched continues to increase.

There is an example of using reindex to modify the document in the docs here. It changes the name of a field, but you can use the same script construct to do what you want. I suspect the script looks something like "script": "ctx._source['timestamp_hour'] = ctx._source['@timestamp'].getHourOfDay()". You are right that it will be much better at search time to use this new field. Watch the time zone, by the way: I believe the hour you get in this case is UTC.
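
To make that concrete, here is a rough sketch of what the request body for POST _reindex might look like, built in Python so it can be inspected before anything is sent. Several things here are assumptions rather than anything confirmed in this thread: the index names logs-old and logs-new are made up, the script is written for Painless, and @timestamp is assumed to sit in _source as an ISO-8601 string with an offset (for example 2016-05-01T13:45:00Z). Since _source holds the raw JSON value rather than a date object, the script parses the string first. Newer releases take the script text under "source"; older ones used "inline". The expected_utc_hour helper is only a local mirror of the script for spot-checking values.

```python
import json
from datetime import datetime, timezone

# Painless script text (assumption: '@timestamp' is an ISO-8601 string in _source).
# It parses the string, normalizes to UTC, and stores the hour in a new field.
HOUR_SCRIPT = (
    "ctx._source['timestamp_hour'] = "
    "ZonedDateTime.parse(ctx._source['@timestamp'])"
    ".withZoneSameInstant(ZoneId.of('Z')).getHour()"
)


def build_reindex_body(source_index, dest_index):
    """Return a JSON-serializable body for POST _reindex (index names are hypothetical)."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "source": HOUR_SCRIPT},
    }


def expected_utc_hour(timestamp):
    """Local mirror of the script: the UTC hour of an ISO-8601 timestamp string."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).hour


if __name__ == "__main__":
    # Print the body so it can be pasted into a console or sent with any HTTP client.
    print(json.dumps(build_reindex_body("logs-old", "logs-new"), indent=2))
```

Running the reindex against a small test index first and comparing a few timestamp_hour values with expected_utc_hour is a cheap way to confirm the time zone handling before touching the full data set.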

You could also do this with _update_by_query, because adding a new field is a change you can make on an existing index. You'll end up with deleted documents in your index, but merging should remove them over time. Not all of them, but a bunch. That might be easier to deal with, depending on what you are doing.
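
The in-place variant could be sketched like this, again in Python and again under assumptions: the script is Painless, @timestamp is an ISO-8601 string in _source, and the body would be sent to POST <your-index>/_update_by_query. The optional exists filter is my own addition, not something from the docs example; it limits the run to documents that do not have the field yet, so an interrupted run can be repeated without rewriting everything.

```python
import json

# Painless script text (assumption: '@timestamp' is an ISO-8601 string in _source).
HOUR_SCRIPT = (
    "ctx._source['timestamp_hour'] = "
    "ZonedDateTime.parse(ctx._source['@timestamp'])"
    ".withZoneSameInstant(ZoneId.of('Z')).getHour()"
)


def build_update_by_query_body(skip_existing=True):
    """Return a JSON-serializable body for POST <index>/_update_by_query."""
    body = {"script": {"lang": "painless", "source": HOUR_SCRIPT}}
    if skip_existing:
        # Only match documents that lack the new field, so re-runs stay cheap.
        body["query"] = {
            "bool": {"must_not": {"exists": {"field": "timestamp_hour"}}}
        }
    return body


if __name__ == "__main__":
    print(json.dumps(build_update_by_query_body(), indent=2))
```

If documents are being written while the update runs, adding conflicts=proceed to the request URL lets it continue past version conflicts instead of aborting; whether that is acceptable depends on your data.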
