Replicas out of sync

For one particular index, I've been having issues with the primary and replicas repeatedly getting out of sync.


When updating this index, I make a delete by query request to delete all values with a particular property, followed immediately by a bulk insert to re-add the updated values.
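In REST terms the update is roughly the following (index, type, and field names here are simplified illustrations of what the client sends):

    POST /inventory_suggestions/_delete_by_query
    {
      "query": { "term": { "clientId": "42" } }
    }

    POST /inventory_suggestions/search_term/_bulk
    { "index": { "_routing": "42" } }
    { "clientId": "42", "text": "first updated suggestion" }
    { "index": { "_routing": "42" } }
    { "clientId": "42", "text": "second updated suggestion" }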

ES is version 5.4, running in a 5-node cluster.

I haven't been able to consistently reproduce it, and I can fix it temporarily by switching the replicas to 0 and back to 1 to get ES to rebuild them, but the issue seems to crop back up after a day or so.
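For reference, the toggle is just an index settings update along these lines (index name illustrative):

    PUT /inventory_suggestions/_settings
    { "index": { "number_of_replicas": 0 } }

    PUT /inventory_suggestions/_settings
    { "index": { "number_of_replicas": 1 } }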

Anyone run into this before?


Hi there,

We have seen very occasional reports of this and have been investigating, but it has proved extremely tricky for us to reproduce. We need help from someone like you, who sees this problem regularly enough to help us diagnose it.

Please could you tell us more about this cluster and the environment in which it lives? For instance: what version are you running exactly? What is it running on? How frequently are you doing the bulk-delete-and-insert that you describe? What other activity does the cluster see?

Would you be willing to run the support diagnostics tool on your cluster and share the results? Don't post them here: I'll get you an email address to use if you can run this.

Would you be able to enable the following very verbose logging, and toggle the replica count to 0 and then back to 1 to make sure everything is in sync? I say again that this is very verbose so it will cause extra I/O and may fill up your disks, so proceed with caution here.

, "logger.org.elasticsearch.action.bulk": "TRACE"
, "logger.org.elasticsearch.cluster.service": "DEBUG"
, "logger.org.elasticsearch.indices.recovery": "TRACE"
, "logger.org.elasticsearch.index.shard": "TRACE"

In case it helps, we've only so far been able to reproduce anything like this by simulating some very strange networking failures that coincide with shards being reallocated, and even then it's very sporadic.

On top of what David suggested, can you share the exact version you use?

The cluster is a 5-node cluster, all nodes running ES 5.4.0 as master/client/data, all CentOS 7.3 VMs with no plugins. The cluster has about 5 indexes in it. The largest has about 1.6M documents with a fair amount of churn and has never gotten out of sync; it updates by diffing and performing bulk updates/deletes on individual documents, which (aside from the number of documents) is the only significant difference between it and the index that is getting out of sync. The index that is causing trouble is pretty new and has just a couple thousand documents, but it is updated by deleting (via delete_by_query) and re-adding groups of documents.

Version details:
"version": {
"number": "5.4.0",
"build_hash": "780f8c4",
"build_date": "2017-04-28T17:43:27.229Z",
"build_snapshot": false,
"lucene_version": "6.5.0"
}

I'll see if I can enable verbose logging. Unfortunately, I can only reproduce this at the moment on our production cluster, so I'll have to check about running the diagnostics tool.

The indexing on the client is being done with NEST on .NET. The code looks roughly like this:

    // Build the bulk index operations for the updated terms, routed by client ID.
    var actions = terms
        .Select(term => new BulkIndexOperation<SearchTerm>(term)
        {
            Routing = term.ClientId
        })
        .Cast<IBulkOperation>();

    // Delete every existing document for this client...
    Client.Instance.DeleteByQuery(new DeleteByQueryRequest("inventory_suggestions", typeof(SearchTerm))
    {
        Query = new TermQuery
        {
            Field = typeof(SearchTerm).GetProperty(nameof(SearchTerm.ClientId)),
            Value = _client.ClientId.ToString()
        }
    });

    // ...then immediately re-add the updated documents in a single bulk request.
    Client.Instance.Bulk(new BulkRequest("inventory_suggestions")
    {
        Operations = actions.ToList()
    });

Thanks, Eric, much appreciated.

We turned on the trace logging and were able to reproduce it getting out of sync on one of the shards. Seems to only be ~15 documents off at the moment. What's the best way to share the logs?



Please could you zip them up and send them to me at david.turner@elastic.co? I'm unlikely to look at them before 0900 UTC on Monday now, so I can't promise an immediate response.

Logs sent. Thank you very much for your help.

Hi Eric,

Thanks for the logs, they're much appreciated. We have come up with one hypothesis about what might possibly be happening here, but unfortunately cannot test it from those logs alone. Could you possibly repeat the period of trace logging with the same settings, starting from a point where the shards are all in sync, wait for them to fall out of sync, and then grab a list of all the document IDs on both primary and replica as well as the logs? Ideally we'd like the indexing process to be stopped and for you to perform a refresh before querying for the doc IDs to make sure that we get everything.
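In case it's useful, one way to grab the IDs is to refresh and then run the same match_all search against the primary and the replica copies using the preference parameter, along these lines (index name and size are illustrative, and indexing should be stopped first):

    POST /inventory_suggestions/_refresh

    GET /inventory_suggestions/_search?preference=_primary&size=10000
    {
      "_source": false,
      "query": { "match_all": {} }
    }

    GET /inventory_suggestions/_search?preference=_replica&size=10000
    {
      "_source": false,
      "query": { "match_all": {} }
    }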

Many thanks,

David

Awesome. We've got the trace logging enabled as before. I'll send you those results once it starts getting out of sync again.

Thanks Eric. Could you also confirm that, in this index at least, you're using auto-generated IDs, and not using external versioning at all?

Many thanks,

David

Yeah, that's correct.

Sent you logs along with the ids on primary/replica. I stopped all indexing and did a refresh of the index prior to pulling the IDs.

Awesome. We didn't find exactly what we expected, but we weren't far off. It seems there are occasions where you index a document and delete it very soon afterwards (before the indexing operation has even returned to the client), and the indexing and deletion operations are arriving in the wrong order at the replica, and for some reason (still under investigation) they're not being put back in the right order. We can now reproduce this with a single document.

As a workaround for you for now, I think it'd be sufficient to avoid running concurrent deletion and indexing operations on your inventory_suggestions_v1 index. Could you try that?
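For instance, one simple way to make sure each step has fully completed and is visible before the next one starts is to have the delete-by-query refresh and run to completion before the bulk request is sent, roughly like this (index, type, and field names illustrative):

    POST /inventory_suggestions_v1/_delete_by_query?refresh=true&wait_for_completion=true
    {
      "query": { "term": { "clientId": "42" } }
    }

    POST /inventory_suggestions_v1/search_term/_bulk?refresh=wait_for
    { "index": { "_routing": "42" } }
    { "clientId": "42", "text": "updated suggestion" }

The important part is simply not to start the next round of deletes until the previous round of indexing has finished.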

I put some stuff in place to try to prevent concurrent indexing, and it hasn't gotten out of sync since. As a longer-term workaround, I'm also changing the indexing strategy to do more targeted updates/deletes, since I think that would decrease the churn considerably for my use case.
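Roughly, the idea is that instead of deleting everything for a client and re-adding it, I'd diff against what's already indexed and send a single bulk request with per-document index and delete operations, something like this (IDs and type name made up):

    POST /inventory_suggestions/search_term/_bulk
    { "index": { "_id": "AVx1a2b3c4", "_routing": "42" } }
    { "clientId": "42", "text": "updated suggestion" }
    { "delete": { "_id": "AVx9z8y7x6", "_routing": "42" } }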

It's an old blog post, but I've never seen it documented anywhere else. Read this: https://www.elastic.co/blog/elasticsearch-versioning-support

EDIT: As seems to be the way with this kind of "documentation", you'll want to skip straight to the last section, titled "Some final words about deletes". What you need to know is never at the start!

I'd have to guess that you're re-using _id values within the window of ES' deleted document garbage collection process -- which would be unrelated to concurrent deleting/indexing.

There are likely a few solutions, but always using an (index-wide) increasing version number for each new doc is one way to fix this. Using ES's auto-generated _ids is probably another, but I haven't confirmed that approach.
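For illustration, external versioning is supplied on each index request, roughly like this (index, type, ID, and version values are made up); Elasticsearch only applies the write if the supplied version is higher than the one it already has stored for that ID:

    PUT /my_index/my_doc/1?version=12345&version_type=external
    {
      "field": "value"
    }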

Good luck!

The documentation on the relationship between deletes and versioning is indeed quite scarce, and I agree that this should be properly spelled out in the reference manual. It is, however, not relevant in this case.

The OP is not re-using document IDs.

Assigning document IDs based on an external counter is certainly possible, but it's quite tricky to make it robust to all the things that might go wrong in your system, particularly network partitions and GC pauses. Auto-generated IDs allow Elasticsearch to do this for you, so I'd say to use that functionality unless you have a very compelling reason to use externally-assigned IDs.

Great news. Thanks for letting us know.
