java.io.IOException: failed to obtain in-memory shard lock


(Priya Prabhakar) #1

We are intermittently getting the error below in the Elasticsearch logs. It clears after some time and then comes back.

elasticsearch version - "6.3.2"

The lines of real concern in the log:

  • marking and sending shard failed due to [failed to create shard]
  • java.io.IOException: failed to obtain in-memory shard lock
  • [70]: failed to obtain shard lock

I have gone through the related blog posts that mention this.
Here is where I need help:

  1. Why is the "shard lock" failure happening?
  2. Is there any data loss when this happens?
  3. How can I track down why it is happening? Is there a command, or any other suggestion, that would give me more details?

(Christian Dahlqvist) #2

What kind of storage/disk are you using for Elasticsearch?


(Priya Prabhakar) #3

We are using RAID0


(Christian Dahlqvist) #4

Local SSDs? Spinning disks?


(Priya Prabhakar) #5

Thanks Christian. It is SSD


(Priya Prabhakar) #6

@Christian_Dahlqvist Let me know if you need any more details


(Christian Dahlqvist) #7

If you have local SSDs I am not sure what could be causing this. What OS and JVM are you using? What does your cluster setup and configuration look like?


(Priya Prabhakar) #8

OS:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"

JVM:
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-2~14.04-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)

Cluster setup:
6 master nodes
189 data nodes
18 ingest nodes
3 coordinating nodes


(Priya Prabhakar) #9

Hope the above info helps.

Can you please share the usual scenarios in which a "shard lock" failure is likely, or what the root cause can be? Any suggestions on how to analyse it, find the root cause, or handle a "shard lock" situation would be welcome.

Any best practices for avoiding this situation would also help.


(Priya Prabhakar) #10

@Christian_Dahlqvist any help in this case?


(Christian Dahlqvist) #11

I have not come across this before, but I did look through the issues on the Elasticsearch GitHub repo and found this. It does however relate to older versions. It may be worthwhile to look through it and see whether any of the potential causes apply to your cluster. If not, I would recommend you open a new issue with as much information about your cluster and workload as possible.
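One way to gather that kind of information is to list the shards that are currently unassigned together with the reason Elasticsearch reports for them. The sketch below is only an illustration: it assumes a node is reachable at `http://localhost:9200` (a placeholder), it uses the `_cat/shards` API with the `unassigned.reason` column, and the helper names are hypothetical and not from this thread.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with a node of your own cluster.
BASE_URL = "http://localhost:9200"


def cat_shards_url(base_url=BASE_URL):
    """Build a _cat/shards URL that returns JSON including the unassigned reason."""
    columns = "index,shard,prirep,state,node,unassigned.reason"
    return f"{base_url}/_cat/shards?format=json&h={columns}"


def unassigned_shards(rows):
    """Keep only the rows for shards that are not assigned to any node."""
    return [row for row in rows if row.get("state") == "UNASSIGNED"]


def fetch_json(url):
    """GET a URL and decode the JSON body (requires a reachable cluster)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

With a reachable cluster, `unassigned_shards(fetch_json(cat_shards_url()))` returns one row per unassigned shard. For a shard whose reason is `ALLOCATION_FAILED`, `GET /_cluster/allocation/explain` gives a more detailed explanation that can be attached to an issue.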


(Priya Prabhakar) #12

Thanks @Christian_Dahlqvist. The error has not reoccurred. I will raise a ticket if it happens again.


(Hamid) #13

Sometimes, when you are out of memory, Elasticsearch gets blocked because it can't allocate shards.
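Whether memory pressure plays a role can be checked per node with the `_cat/nodes` API. The following is a rough sketch, not a diagnosis tool: the address is a placeholder, the 85 percent threshold is an arbitrary assumption, and the function names are made up for illustration.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with a node of your own cluster.
BASE_URL = "http://localhost:9200"


def cat_nodes_url(base_url=BASE_URL):
    """Build a _cat/nodes URL that returns heap and RAM usage as JSON."""
    return f"{base_url}/_cat/nodes?format=json&h=name,heap.percent,ram.percent"


def nodes_above_heap(rows, threshold_percent=85):
    """Return (name, heap percent) pairs above the threshold, highest first.

    The cat APIs return values as strings, so they are converted to int here.
    The default threshold is an assumption, not an Elasticsearch setting.
    """
    busy = []
    for row in rows:
        heap = int(row["heap.percent"])
        if heap > threshold_percent:
            busy.append((row["name"], heap))
    return sorted(busy, key=lambda item: item[1], reverse=True)


def fetch_json(url):
    """GET a URL and decode the JSON body (requires a reachable cluster)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling `nodes_above_heap(fetch_json(cat_nodes_url()))` against a live cluster lists the data nodes worth looking at first when the shard lock errors appear.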


(Priya Prabhakar) #14

Thanks @hzarrabi for answering. We are still unable to locate what caused the issue.


(Hamid) #15

Is your Elasticsearch responding over HTTP?

http://ipaddress:9200
If yes, do you have any indices in red status? You can check by opening this URL on your Elasticsearch server from any browser:
http://ipaddress:9200/_cat/indices?v&h=host,creation.date.string,index,health,status&s=creation.date

If some are red: in my case I was able to resolve it by deleting some of the red indices from the Linux command line:
sudo curl -XDELETE http://ipaddress:9200/<index_name>
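The red-index check can also be scripted. The sketch below assumes a placeholder address and uses hypothetical helper names; it relies on the `health` filter of `_cat/indices` and on the delete index API (`DELETE /<index_name>`). It only builds the delete request and does not send it, because deleting an index permanently removes its data.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with a node of your own cluster.
BASE_URL = "http://localhost:9200"


def red_indices_url(base_url=BASE_URL):
    """Build a _cat/indices URL that returns only red indices, as JSON."""
    return f"{base_url}/_cat/indices?format=json&health=red&h=index,health,status"


def red_index_names(rows):
    """Return the sorted names of indices whose health is red."""
    return sorted(row["index"] for row in rows if row.get("health") == "red")


def delete_index_request(index_name, base_url=BASE_URL):
    """Build (but do not send) a DELETE request for one index."""
    return urllib.request.Request(f"{base_url}/{index_name}", method="DELETE")


def fetch_json(url):
    """GET a URL and decode the JSON body (requires a reachable cluster)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

A request built by `delete_index_request` is only sent if it is passed to `urllib.request.urlopen`, which leaves room to review the list from `red_index_names` before anything is removed.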

