Groovy and Replica performance issue

liorisme6 · November 13, 2017, 8:34am

Hi there,

I'll start by confessing that we're still using an ancient version of ES (1.7.2), so please don't laugh.
We plan to upgrade very soon.

We receive about 30k events every minute with the following data:

timestamp
some id
status for this id
some additional fields

The events may include new ids, but typically they refer to existing ones.
We need to index the data in ES so that we can calculate how long each id stays in each status. That means we keep the timestamp of a new event only if the id is new or the status for this id has changed.
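A minimal sketch of this rule, written in Python rather than Groovy and not taken from the actual script. The field names `id`, `status`, `timestamp` and `status_since` are assumptions made for illustration:

```python
from typing import Optional


def apply_event(doc: Optional[dict], event: dict) -> dict:
    """Return the document to store after applying one event.

    `doc` is the currently stored document for this id, or None if the id is new.
    All field names here are hypothetical.
    """
    if doc is None or doc["status"] != event["status"]:
        # New id, or the status changed: a new status period starts now,
        # so the event's timestamp is recorded.
        return {
            "id": event["id"],
            "status": event["status"],
            "status_since": event["timestamp"],
        }
    # Same status as before: keep the original timestamp untouched.
    return doc


# Usage: replay a few events against an in-memory store.
store = {}
for ev in [
    {"id": "a", "status": "up", "timestamp": 100},
    {"id": "a", "status": "up", "timestamp": 160},
    {"id": "a", "status": "down", "timestamp": 220},
]:
    store[ev["id"]] = apply_event(store.get(ev["id"]), ev)
```

With this rule, a repeated event with an unchanged status leaves the stored timestamp alone, so the duration of a status can be read as the gap between consecutive `status_since` values.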

We tackled this with a simple Groovy script that updates the documents. However, when we set the replication factor to 1, indexing became much slower, to the point that our bulk updates (which also include other events) were rejected. As it stands we cannot work with replication, because in that case we lose data (and, worse, the indexing time of the other events in the same bulk is affected).
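For readers unfamiliar with scripted updates, a scripted bulk update in ES 1.x could look roughly like the sketch below. It is not the script from this post: the Groovy body, the index name `events`, the type name `status` and the field names are all assumptions, and the Python code only builds the request lines without sending them.

```python
import json

# Hypothetical Groovy body: only touch the document when the status changed.
GROOVY_SCRIPT = (
    "if (ctx._source.status != status) { "
    "ctx._source.status = status; ctx._source.status_since = ts }"
)


def bulk_update_lines(event, index="events", doc_type="status"):
    """Build the two newline-delimited JSON lines of one bulk `update` action.

    Index, type and field names are placeholders, not taken from the post.
    """
    action = {"update": {"_index": index, "_type": doc_type, "_id": event["id"]}}
    body = {
        "script": GROOVY_SCRIPT,
        "lang": "groovy",
        "params": {"status": event["status"], "ts": event["timestamp"]},
        # Used when the id does not exist yet: the first event sets the timestamp.
        "upsert": {"status": event["status"], "status_since": event["timestamp"]},
    }
    return json.dumps(action) + "\n" + json.dumps(body) + "\n"


# Usage: one event becomes one action line plus one body line.
payload = bulk_update_lines({"id": "a", "status": "up", "timestamp": 100})
```

The `upsert` document covers the "id is new" case, while the script covers the "status has changed" case for existing ids.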

We prefer not to change the queue_size and the bulk_size, and we cannot split the bulk in two (one for this event type and one for the other event types).

I was wondering if you have any better suggestions for modeling this problem, or any ideas on how to tackle the replication performance issue.
Is there any way to postpone the replication (we're fine with that)? Or to make it more efficient?

Thanks a lot!

system · December 11, 2017, 8:35am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
