Hi all. We are just getting started with Elasticsearch and it works great on
our local machines. However, when moving to our test and production
servers on OpenVZ we run into a bunch of problems. The one we really
can't get our heads around is Elasticsearch suddenly failing to acquire
new file locks.
We are running Ubuntu 10.10 on an OpenVZ virtual machine. We have 3 GB
of memory available, and when starting Elasticsearch it consumes about
1.5 GB of that.
These are our settings:
cluster.name: elasticsearch-kundotest
index.number_of_shards: 1
index.number_of_replicas: 0
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
When we build all our indexes, we create about 600 indexes and add in
total about 30,000 documents. Some indexes are big (3,000+ documents)
and some are really small (2-3 documents).
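For reference, the indexes themselves are created through pyes, roughly like this (a simplified sketch, not our actual code; the host and variable names are illustrative):

    from pyes import ES

    conn = ES('127.0.0.1:9200')  # illustrative host:port

    index_names = ['orebro-sk']  # ...plus the rest of our ~600 index names
    for name in index_names:
        # Each index picks up the 1-shard / 0-replica settings
        # from the elasticsearch.yml settings above.
        conn.create_index(name)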
This is our ulimit config:
root@testserver:/usr/sbin/elasticsearch# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 64000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
We use celery as a queue and pyes to talk to Elasticsearch, so there is
basically one connection created for every document indexed, if I
understand things correctly...
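Simplified, each celery task does roughly the following (a sketch, not our actual code; the task name, argument names, and host are illustrative):

    from celery.task import task
    from pyes import ES

    @task
    def index_document(index_name, doc):
        # A fresh pyes connection is created on every invocation,
        # i.e. effectively one connection per document indexed.
        conn = ES('127.0.0.1:9200')  # illustrative host:port
        conn.index(doc, index_name, 'dialog')  # 'dialog' is the mapping type seen in the logs below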
The problem we get after running the indexing process for a while is
this:
[2011-12-12 10:38:37,173][INFO ][cluster.metadata ] [Aginar] [orebro-sk] create_mapping [dialog]
[2011-12-12 10:38:38,152][WARN ][indices.cluster ] [Aginar] [orebro-sk][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [orebro-sk][0] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [orebro-sk][0] Failed to create engine
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:249)
    at org.elasticsearch.index.shard.service.InternalIndexShard.start(InternalIndexShard.java:267)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:123)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:179)
    ... 3 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/usr/sbin/elasticsearch/data/elasticsearch-kundotest/nodes/0/indices/orebro-sk/0/index/write.lock: java.io.IOException: No locks available
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1108)
    at org.elasticsearch.index.engine.robin.RobinEngine.createWriter(RobinEngine.java:1303)
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:247)
    ... 6 more
Caused by: java.io.IOException: No locks available
    at sun.nio.ch.FileChannelImpl.lock0(Native Method)
    at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:924)
    at java.nio.channels.FileChannel.tryLock(FileChannel.java:978)
    at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:216)
    at org.apache.lucene.store.Lock.obtain(Lock.java:95)
    ... 9 more
[2011-12-12 10:38:38,156][WARN ][cluster.action.shard ] [Aginar] sending failed shard for [orebro-sk][0], node[50DYl0MPTbOum-IFi8R-ug], [P], s[INITIALIZING], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[orebro-sk][0] failed recovery]; nested: EngineCreationFailureException[[orebro-sk][0] Failed to create engine]; nested: LockObtainFailedException[Lock obtain timed out: NativeFSLock@/usr/sbin/elasticsearch/data/elasticsearch-kundotest/nodes/0/indices/orebro-sk/0/index/write.lock: java.io.IOException: No locks available]; nested: IOException[No locks available]; ]]
[2011-12-12 10:38:38,156][WARN ][cluster.action.shard ] [Aginar] received shard failed for [orebro-sk][0], node[50DYl0MPTbOum-IFi8R-ug], [P], s[INITIALIZING], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[orebro-sk][0] failed recovery]; nested: EngineCreationFailureException[[orebro-sk][0] Failed to create engine]; nested: LockObtainFailedException[Lock obtain timed out: NativeFSLock@/usr/sbin/elasticsearch/data/elasticsearch-kundotest/nodes/0/indices/orebro-sk/0/index/write.lock: java.io.IOException: No locks available]; nested: IOException[No locks available]; ]]
[2011-12-12 10:38:54,463][INFO ][node ] [Aginar] {0.18.5}[31853]: stopping ...
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
[1]+ Stopped ./bin/elasticsearch -f -Des.max-open-files=true
Any ideas on what could be causing this?