LockObtainFailedException in ES


(pran) #1

Hi,
We have a cluster with 16 nodes. Each node is Data and Master Node. We have ES version 2.4. Recently, we have started getting the following error:
[2017-05-01 23:28:04,221][WARN ][cluster.action.shard ] [servername] [indexname][5] received shard failed for target shard [[indexname][5], node[aqM0l4oWS7iIXKiBJtmjkQ], [R], v[525], s[INITIALIZING], a[id=P_wPEdfkQ2OMWgZjPs6K8w], unassigned_info[[reason=NODE_LEFT], at[2017-05-01T21:24:56.380Z], details[node_left[aqM0l4oWS7iIXKiBJtmjkQ]]]], indexUUID [zIkg0_jkTJeH3vUIqEm2Zg], message [failed to create shard], failure [ElasticsearchException[failed to create shard]; nested: LockObtainFailedException[Can't lock shard [indexname][5], timed out after 5000ms]; ]
[indexname][[indexname][5]] ElasticsearchException[failed to create shard]; nested: LockObtainFailedException[Can't lock shard [indexname][5], timed out after 5000ms];
at org.elasticsearch.index.IndexService.createShard(IndexService.java:389)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:601)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:501)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:166)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Can't lock shard [indexname][5], timed out after 5000ms
at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:609)
at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:537)
at org.elasticsearch.index.IndexService.createShard(IndexService.java:306)
... 10 more
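
For context, the shards stuck in INITIALIZING (and the NODE_LEFT reason shown in the log) can be listed with the cat shards API. This is a diagnostic sketch only; the host/port are placeholders for any node in the cluster, and the `unassigned.reason` column is part of the standard cat API:

```shell
# List every shard that is not STARTED, with its unassigned reason.
# localhost:9200 is a placeholder; point this at any node in the cluster.
curl -s 'localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason' \
  | grep -v STARTED
```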

Please help me understand this error and what its probable cause could be. Please let me know if you require any more information.
Regards,
Pran


(Christian Dahlqvist) #2

What type of file system are you using?


(pran) #3

Hi,
Following are the server specs:
Virtual machine - Linux - Hyper-V - 16 cores, 64 GB RAM

Let me know if you require more information.
Regards,
Pran


(pran) #4

Hi Christian,
We are getting this error in production and are not sure why it is occurring. There is no issue with disk space, as per the following stats:
"path": "/mnt/sdb/elasticsearch/abc_production/nodes/0",
"mount": "/mnt/sdb (/dev/sdb)",
"type": "ext3",
"total_in_bytes": 1056894091264,
"free_in_bytes": 905138794496,
"available_in_bytes": 851451703296,
"spins": "true"

Please let me know if you have any clue regarding the aforesaid issue. Do let me know if you require more information about the cluster settings.
Regards,
Pran


(Christian Dahlqvist) #5

If I calculate correctly, your disk is just above 85% full, which means it has passed the low watermark. This will impact how Elasticsearch allocates shards, and even though I am not sure it is directly related to your issue, it seems like a strange coincidence. You may want to change the watermark settings or remove some data to see if that affects the issue.
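
The watermark behaviour mentioned above can be inspected and adjusted through the cluster settings API. A sketch, not a recommendation; the host/port are placeholders, and on 2.x the GET only echoes back settings that have been explicitly overridden (defaults such as the 85% low watermark are not shown):

```shell
# Show any explicitly configured cluster settings (watermark overrides included).
curl -s 'localhost:9200/_cluster/settings?pretty'

# Transiently raise the low watermark from its 85% default, e.g. to 90%.
curl -s -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%"
  }
}'
```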


(pran) #6

Hi Christian,
I think it is not 85% full; 85% of the space is available, so there is no issue with space.
Regards,
Pran
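
For reference, the percentage follows directly from the stats quoted earlier in the thread (numbers copied verbatim from the fs stats output):

```python
# Disk usage check from the stats quoted above:
#   total_in_bytes = 1056894091264, free_in_bytes = 905138794496
total_in_bytes = 1056894091264
free_in_bytes = 905138794496

free_pct = 100.0 * free_in_bytes / total_in_bytes  # ~85.6% free
used_pct = 100.0 - free_pct                        # ~14.4% used

print(f"free: {free_pct:.1f}%, used: {used_pct:.1f}%")
```

So the disk is roughly 14% used, well below the default 85% low watermark.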


(Christian Dahlqvist) #7

You are indeed correct. Not sure how I ended up getting that switched around....


(pran) #8

Hi Christian,
I wanted to share the logs from our cluster so that you can get a better understanding of what is going wrong. Please let me know your mailing address so that I can send them to you. (Sorry, I am not able to find any option to upload logs here.)
Regards,
Pran


(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.