Okay. I've found a very weird workaround for this problem, and I really
don't think it makes any sense.
After running jstack with the pid of ES, I found that the Java process was
blocked in the native method lock0, as the following stack trace shows.
"main" #1 prio=5 os_prio=0 tid=0x000000000238f000 nid=0x63ef runnable [0x000000004110e000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcherImpl.lock0(Native Method)
at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:90)
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1067)
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1155)
at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:169)
- locked <0x00000000c10e2898> (a org.apache.lucene.store.NativeFSLock)
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:83)
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:157)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:70)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:203)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Then I just ran pstack with the pid. Although it printed nothing useful,
the blocked process magically continued, successfully obtained the file
lock, and recovered the index.
I believe the cause of the problem is that file locking on the NAS/NFS is
misconfigured. Because I don't have the privileges to reconfigure it, I'm
just going to use this workaround, but I would still like an
explanation...
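For anyone who wants to check the mount outside of Elasticsearch, here is a
minimal sketch of a standalone probe. It makes the same
FileChannel.tryLock call that appears in the stack trace above, but on a
scratch file, and gives up after a timeout. The class name LockProbe, the
file name lockprobe.lock and the 10 second timeout are my own assumptions
for illustration; this is not Lucene's NativeFSLock code, and passing the
probe once does not prove the mount is configured correctly.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LockProbe {

    // Tries to take an exclusive lock on a scratch file inside dir.
    // Returns true if the lock was obtained within timeoutMillis; false if
    // tryLock reported the file as locked elsewhere or did not return in time.
    // The scratch file (hypothetical name) is left behind in dir.
    static boolean canLock(Path dir, long timeoutMillis) throws Exception {
        Path lockFile = dir.resolve("lockprobe.lock");

        // A call stuck in native code may not react to interruption, so the
        // worker is a daemon thread and cannot keep the JVM alive.
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "lock-probe");
            t.setDaemon(true);
            return t;
        });

        Future<Boolean> attempt = pool.submit(() -> {
            try (FileChannel ch = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                FileLock lock = ch.tryLock();
                if (lock == null) {
                    return false; // another process holds the lock
                }
                lock.release();
                return true;
            }
        });

        try {
            return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return false; // tryLock never came back: the symptom seen above
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Directory to probe; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        boolean ok = canLock(dir, 10_000);
        System.out.println(ok ? "lock obtained" : "lock NOT obtained within 10 s");
    }
}
```

Running it once against a local directory and once against the NAS path
should show whether the hang is specific to the mount rather than to
Elasticsearch itself.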
On Friday, December 5, 2014 4:41:19 PM UTC-5, Yingkai Gao wrote:
I'm using Elasticsearch-1.4.0 on CentOS-5.6. It works well if I set
path.data to a local or NFS directory. However, if I set path.data to a NAS
folder, the node gets stuck after initializing and loading plugins.
It looks a lot like the problem described at
http://elasticsearch-users.115913.n3.nabble.com/ElasticSearch-fails-on-NFS-makes-tons-of-empty-directories-in-nodes-td3765236.html
but I'm using NAS. The node did create the index directories on the NAS
path, but it just stopped there.
The starting log of Elasticsearch is:
[2014-12-05 16:36:12,745][INFO ][node ] [kyle] version[1.4.0], pid[4819], build[bc94bd8/2014-11-05T14:26:12Z]
[2014-12-05 16:36:12,747][INFO ][node ] [kyle] initializing ...
[2014-12-05 16:36:12,755][INFO ][plugins ] [kyle] loaded , sites
According to df, the file system of the mounted NAS is:
nas-2-25:/exports/volume02
Does anyone have any idea how to fix this problem? I know it is not
recommended to use NAS for the index, but I have to because of an
infrastructure problem with our cluster.
Thanks,
Kyle