Synced Flush Causes Node to Restart

Harlin_ES · June 23, 2015, 10:01pm

I was attempting to do a cluster wide restart on my Elasticsearch cluster by first doing a synced flush, but ran into problems. The sync caused one of our nodes to randomly restart, sending our cluster into a time-consuming recovery phase.

Before the sync call I set: "cluster.routing.allocation.enable": "none", in preparation for the restart.

I am running an 8 node cluster. All nodes are master + data nodes. We have a total of 4TB of data across the cluster.

Has anybody else ran into issues with synced flushes?

Thanks in advanced for any help,
Harlin

warkolm · June 24, 2015, 5:52am

What version, did you check your logs?

This isn't something I've seen.

Harlin_ES · June 24, 2015, 4:04pm

The logs only show the shutdown, not much else:

[2015-06-23 20:36:21,134][INFO ][discovery.ec2 ] [Leech] master_left [[Wizard][C4feeCcxQQKuCx8eItWuAQ][ip-10-70-104-157][inet[/10.70.104.157:9300]]{master=true}], reason [shut_down]
[2015-06-23 20:36:21,136][WARN ][discovery.ec2 ] [Leech] master left (reason = shut_down), current nodes: {[Shriker][qjQLhrV4QGW69gje_bbEgg][ip-10-70-113-90][inet[/10.70.113.90:9300]]{master=true},[Blur][AXei-iNGQYOq9zqIMGW1XQ][ip-10-70-113-89][inet[/10.70.113.89:9300]]{master=true},[Thunderstrike][8jnth1NDRECV_VRxhQKcLw][ip-10-70-69-119][inet[/10.70.69.119:9300]]{master=true},[Jack Kirby][6hYvIZ9PQ4ah9bgPeYjNOA][ip-10-70-113-91][inet[/10.70.113.91:9300]]{master=true},[Armand Martel][6O2SniZ-TGO7_p193-hiwA][ip-10-70-113-94][inet[/10.70.113.94:9300]]{master=true},[Manikin][a2j-vPTRTPG0GjM5VA4N5w][ip-10-70-113-93][inet[/10.70.113.93:9300]]{master=true},[Leech][44XSwmYqS2e2kBEFc8O7RQ][ip-10-70-113-92][inet[/10.70.113.92:9300]]{master=true},}
[2015-06-23 20:36:21,137][INFO ][cluster.service ] [Leech] removed {[Wizard][C4feeCcxQQKuCx8eItWuAQ][ip-10-70-104-157][inet[/10.70.104.157:9300]]{master=true},}, reason: zen-disco-master_failed ([Wizard][C4feeCcxQQKuCx8eItWuAQ][ip-10-70-104-157][inet[/10.70.104.157:9300]]{master=true})
[2015-06-23 20:36:28,715][INFO ][cluster.service ] [Leech] new_master [Leech][44XSwmYqS2e2kBEFc8O7RQ][ip-10-70-113-92][inet[/10.70.113.92:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2015-06-23 20:36:40,322][INFO ][cluster.service ] [Leech] added {[Anvil][uX49t4y7RlWYOanQb3t2yA][ip-10-70-104-157][inet[/10.70.104.157:9300]]{master=true},}, reason: zen-disco-receive(join from node[[Anvil][uX49t4y7RlWYOanQb3t2yA][ip-10-70-104-157][inet[/10.70.104.157:9300]]{master=true}])

It was the master node that failed, maybe the flush was too hard on the cluster state?

Harlin_ES · June 24, 2015, 4:56pm

So when we do a GET on the synced flush it looks like there are some pending operations:

"failures": [
{"shard": 22,"reason": "pending operations","routing": {"state": "STARTED","primary": false,"node": "44XSwmYqS2e2kBEFc8O7RQ","relocating_node": null,"shard": 22,"index": "whois-current-v3"}},
{"shard": 39,"reason": "pending operations","routing": {"state": "STARTED","primary": false,"node": "qjQLhrV4QGW69gje_bbEgg","relocating_node": null,"shard": 39,"index": "index"}},
{"shard": 58,"reason": "pending operations","routing": {"state": "STARTED","primary": true,"node": "44XSwmYqS2e2kBEFc8O7RQ","relocating_node": null,"shard": 58,"index": "index"}},
{"shard": 28,"reason": "pending operations","routing": {"state": "STARTED","primary": false,"node": "6hYvIZ9PQ4ah9bgPeYjNOA","relocating_node": null,"shard": 28,"index": "index"}}]},

This is slightly confusing though since we haven't touched the cluster in terms of indexing for several days and there is no balancing/relocating happening

warkolm · June 24, 2015, 10:39pm

What version are you on?

Harlin_ES · June 24, 2015, 11:35pm

We are on ES 1.6.

nik9000 · July 8, 2015, 3:29pm

Looks like we're also seeing some pending operations when we don't think we have any.

nik9000 · July 8, 2015, 3:50pm

This is our beta cluster which has always been a bit weird.

And have some output format:

manybubbles@deployment-bastion:/srv/mediawiki$ curl -s deployment-elastic05:9200/_flush/synced | python -mjson.tool | tee synced_flush
{
    "_shards": {
        "failed": 25, 
        "successful": 740, 
        "total": 765
    }, 
    "commonswiki_content_first": {
        "failed": 0, 
        "successful": 3, 
        "total": 3
    }, 
    "commonswiki_file_first": {
        "failed": 9, 
        "failures": [
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "x0CoVTqaTTqj1IuoLIB0LQ", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 3, 
                    "state": "STARTED"
                }, 
                "shard": 3
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "RWFa4RqbQjG7xmP95l_b0Q", 
                    "primary": true, 
                    "relocating_node": null, 
                    "shard": 3, 
                    "state": "STARTED"
                }, 
                "shard": 3
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "-qPd7LtcQRaszihtRgytWA", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 1, 
                    "state": "STARTED"
                }, 
                "shard": 1
            }, 

            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "x0CoVTqaTTqj1IuoLIB0LQ", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 1, 
                    "state": "STARTED"
                }, 
                "shard": 1
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "Ro1ZcPd9SVetETtEgl8Ceg", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 1, 
                    "state": "STARTED"
                }, 
                "shard": 1
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "RWFa4RqbQjG7xmP95l_b0Q", 
                    "primary": true, 
                    "relocating_node": null, 
                    "shard": 1, 
                    "state": "STARTED"
                }, 
                "shard": 1
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "-qPd7LtcQRaszihtRgytWA", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 6, 
                    "state": "STARTED"
                }, 
                "shard": 6
            }, 

            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "RWFa4RqbQjG7xmP95l_b0Q", 
                    "primary": true, 
                    "relocating_node": null, 
                    "shard": 6, 
                    "state": "STARTED"
                }, 
                "shard": 6
            }, 
            {
                "reason": "pending operations", 
                "routing": {
                    "index": "commonswiki_file_first", 
                    "node": "Ro1ZcPd9SVetETtEgl8Ceg", 
                    "primary": false, 
                    "relocating_node": null, 
                    "shard": 6, 
                    "state": "STARTED"
                }, 
                "shard": 6
            }
        ], 
        "successful": 19, 
        "total": 28
    }, 
...

We've totally isolated the cluster:

manybubbles@deployment-bastion:/srv/mediawiki$ curl 'deployment-elastic05:9200/_cat/thread_pool?v&h=host,optimize.active,merge.active,warmer.active,suggest.active,snapshot.active,search.active,refresh.active,percolate.active,management.active,index.active,get.active,generic.active,flush.active,bulk.active'
host                 optimize.active merge.active warmer.active suggest.active snapshot.active search.active refresh.active percolate.active management.active index.active get.active generic.active flush.active bulk.active 
deployment-elastic05               0            0             0              0               0             0              0                0                 1            0          0              0            0           0 
deployment-elastic08               0            0             0              0               0             0              0                0                 1            0          0              0            0           0 
deployment-elastic06               0            0             0              0               0             0              0                0                 1            0          0              0            0           0 
deployment-elastic07               0            0             0              0               0             0              0                0                 1            0          0              0            0           0

But maybe I have a hint:

watch curl -s \''deployment-elastic05:9200/_cat/thread_pool?v&h=host,management.completed'\'
Every 2.0s: curl -s 'deployment-elastic05:9200/_cat/thread_pool?v&h=host,management.completed'                                                                                       Wed Jul  8 15:49:46 2015

host                 management.completed
deployment-elastic05              2519025   <--- These number keep going up
deployment-elastic08              2479017
deployment-elastic06              2506622
deployment-elastic07              2517234

Harlin_ES · July 8, 2015, 4:07pm

Yes! This is identical to what happened to us. The cluster was completely inactive yet was still claiming to have pending operations. No idea what is causing this.

nik9000 · July 8, 2015, 4:11pm

I will know. Soon.

nik9000 · July 8, 2015, 7:30pm

Well, not that soon. I've gotten pretty deep and I'm still not sure. Turning on index logging is useful but it doesn't tell me why the IndexWriter thinks there are pending actions. I'll keep looking. Still not sure if its actually a bug or what.

Harlin_ES · July 8, 2015, 9:05pm

Yeah, I'm starting to lean towards this being a possible bug. Thanks for digging into this!

nik9000 · July 8, 2015, 9:10pm

I'll look again in the morning.

s1monw · July 9, 2015, 8:29am

hey guys,

I think I know what's going on here... lemme explain a bit how this works:

when a synced flush is scheduled or triggered via a get we do:

flush the engine if its needed (this is determined by a boolean outside called "flushNeeded" of IndexWriter in 1.x)
do the synced flush where we check the indexwriter if there are "pending_changes" (using "IndexWriter#hasUncommittedChanges()"

What can happen if you stop indexing is that we flush the documents and set "flushNeeded = false" now a merge kicks off that caused "IndexWriter#hasUncommittedChanges()" to return true from now on. (note "flushNeeded" is still false since it's maintained outside of indexwriter)

Now you kick off a synced flush and it will not "clean" the indexwriter change count since "flushNeeded" but the second step of synced flush fails since it has not been flushed. I totally see this being a bug though. Yet, I'd like to figure out if that is actually the case so if you send a force flush and then a sync flush it should not give you these errors? Can you run this:

POST /${your_index}/_flush?force=true&wait_if_ongoing=true

And after that:

POST  /${your_index}/_flush/synced

And paste the responses here?

nik9000 · July 9, 2015, 12:33pm

Yup. That did it:

manybubbles@deployment-bastion:~$ curl -s deployment-elastic05:9200/_flush/synced?pretty | tee synced_flush_pre
{
  "_shards" : {
    "total" : 765,
    "successful" : 740,
    "failed" : 25
  },
...
# OK problem still exists
manybubbles@deployment-bastion:~$ curl -s 'deployment-elastic05:9200/_flush?pretty&force&wait_if_ongoing' | tee flush_forced
{
  "error" : "ClusterBlockException[blocked by: [FORBIDDEN/5/index read-only (api)];]",
  "status" : 403
}
manybubbles@deployment-bastion:~$ curl -XPUT deployment-elastic05:9200/apifeatureusage-2015.07.08/_settings?pretty -d '{
>   "index.blocks.read_only" : false
> }'
{
  "acknowledged" : true
}
manybubbles@deployment-bastion:~$ curl -s 'deployment-elastic05:9200/_flush?pretty&force&wait_if_ongoing' | tee flush_forced
{
  "_shards" : {
    "total" : 765,
    "successful" : 765,
    "failed" : 0
  }
}
manybubbles@deployment-bastion:~$ head synced_flush_post 
{
  "_shards" : {
    "total" : 765,
    "successful" : 765,
    "failed" : 0
  },
...
# That did it!

nik9000 · July 9, 2015, 1:23pm

Victory! I just did a rolling restart in our beta environment with synced flush. It seemed to work. With this workaround we can totally upgrade production! Hurray. Only one more slow restart!

s1monw · July 9, 2015, 6:44pm

awesome! thanks for reporting back. It will be fixed in the soon coming 1.6.1 release

Topic		Replies	Views
_flush/synced command shows failures after being clean - indexing stopped Elasticsearch	3	500	July 6, 2021
Failed to execute `flush synced` Elasticsearch	1	338	October 8, 2020
Flush sync failed during upgrade process Elasticsearch	2	692	March 20, 2018
Node flapping between creating node and discarding it Elasticsearch	3	686	July 6, 2017
Losing data if I don't flush Elasticsearch	5	368	July 6, 2017

Synced Flush Causes Node to Restart

Related topics