We're using Elasticsearch to store our application logs in daily indices, which we delete after two weeks. Now we would like to keep all logs with the log level "error" for another 2 months.
We thought about using a separate index for all error messages and applying different deletion rules there.
However, we don't want to add the logic for deciding which index to write to ("normal" or the error index) to each of our app servers' Logstash configs.
Is there a way to stream / continuously copy specific documents (the ones with level "error") from one index to another?
It seems like the Reindex API could be used for that.
Would the right approach be to schedule a job that runs the reindex every 5 minutes or so?
Does reindex check which documents already exist in the destination and skip those?
If we run the script every 5 minutes, we could probably also add a "query" field to limit the copied documents to those from the last 5 minutes, as in the sketch below.
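Something like this is what I have in mind; a minimal sketch, assuming hypothetical names on our side: daily indices matching logstash-*, a destination index called app-errors, the log level in a level field, and timestamps in @timestamp:

```
POST _reindex
{
  "source": {
    "index": "logstash-*",
    "query": {
      "bool": {
        "filter": [
          { "term": { "level": "error" } },
          { "range": { "@timestamp": { "gte": "now-5m" } } }
        ]
      }
    }
  },
  "dest": {
    "index": "app-errors"
  }
}
```

The range clause would restrict each run to roughly the last 5 minutes, though overlapping runs could still pick up the same document twice, which is why the dedup question above matters.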
The Reindex API docs say:

> Setting version_type to external causes Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination than they do in the source.
I understand this to mean that documents which already exist in the target index will be updated if a newer version is available, but not inserted twice.
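If that reading is right, the job could set version_type to external on the destination so that re-running the reindex over an overlapping time window stays idempotent. A sketch with the same hypothetical names as above; I've added "conflicts": "proceed" because, as I understand it, documents whose destination version is already current would otherwise fail with a version conflict and abort the request:

```
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "logstash-*",
    "query": { "term": { "level": "error" } }
  },
  "dest": {
    "index": "app-errors",
    "version_type": "external"
  }
}
```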