I have two Elasticsearch nodes set up with a 30 GB heap each, on a
large box with 128 GB of RAM and 3.1 TB of store size. Apart from all the
standard configuration settings in the elasticsearch.yml file, I added one
additional setting:
index.store.type: mmapfs
By default Elasticsearch uses NIOFSDirectory, so I changed it to mmapfs and
restarted the cluster with this new setting.
The index size and shard configuration are as follows:
index.number_of_shards: 10
index.number_of_replicas: 0
Each index is approximately 30 GB, and there are 432 indices in total.
The mmapfs setting works nicely until each node hits the 1041 GB virtual
memory mark (stats noted down using the top command). After that, recovery
(assigning the unassigned shards) stops and I get the following errors in
the logs:
[2013-11-06 21:50:52,190][WARN ][monitor.jvm ] [node0] [gc][
ConcurrentMarkSweep][575][2] duration [22.1s], collections [2]/[23s], total
[22.1s]/[22.1s], memory [11.6gb]->[11.3gb]/[29.8gb], all_pools {[Code Cache]
[6.1mb]->[6.1mb]/[48mb]}{[Par Eden Space] [210.1mb]->[15.4mb]/[1.4gb]}{[Par
Survivor Space] [191.3mb]->[0b]/[191.3mb]}{[CMS Old Gen] [11.2gb]->[11.3gb
]/[28.1gb]}{[CMS Perm Gen] [30.3mb]->[30.3mb]/[82mb]}
[2013-11-06 21:51:04,303][WARN ][monitor.jvm ] [node0] [gc][
ConcurrentMarkSweep][576][3] duration [12s], collections [1]/[12.1s], total
[12s]/[34.2s], memory [11.3gb]->[11.3gb]/[29.8gb], all_pools {[Code Cache] [
6.1mb]->[6.1mb]/[48mb]}{[Par Eden Space] [15.4mb]->[80.8kb]/[1.4gb]}{[Par
Survivor Space] [0b]->[0b]/[191.3mb]}{[CMS Old Gen] [11.3gb]->[11.3gb]/[
28.1gb]}{[CMS Perm Gen] [30.3mb]->[30.3mb]/[82mb]}
[2013-11-06 21:51:04,310][WARN ][indices.memory ] [node0] failed
to set shard [2013-08-22.00:00][9] index buffer to [4mb]
[2013-11-06 21:51:04,311][WARN ][indices.cluster ] [node0] [2013-03
-09.06:00][5] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [2013-03
-09.06:00][5] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(
IndexShardGatewayService.java:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [
2013-03-09.06:00][5] failed to open reader on writer
at org.elasticsearch.index.engine.robin.RobinEngine.start(
RobinEngine.java:290)
at org.elasticsearch.index.shard.service.InternalIndexShard.
performRecoveryPrepareForTranslog(InternalIndexShard.java:610)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.
recover(LocalIndexShardGateway.java:200)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(
IndexShardGatewayService.java:174)
... 3 more
Caused by: java.io.IOException: Map failed
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:849)
at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:283)
at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(
MMapDirectory.java:228)
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.
java:195)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory
.java:72)
at org.elasticsearch.index.store.Store$StoreDirectory.openInput(
Store.java:454)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(
Lucene41PostingsReader.java:72)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.
fieldsProducer(Lucene41PostingsFormat.java:430)
at org.elasticsearch.index.codec.postingsformat.
BloomFilterPostingsFormat$BloomFilteredFieldsProducer.<init>(
BloomFilterPostingsFormat.java:129)
at org.elasticsearch.index.codec.postingsformat.
BloomFilterPostingsFormat.fieldsProducer(BloomFilterPostingsFormat.java:100)
at org.elasticsearch.index.codec.postingsformat.
ElasticSearch090PostingsFormat.fieldsProducer(ElasticSearch090PostingsFormat
.java:81)
at org.apache.lucene.codecs.perfield.
PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.
fieldsProducer(PerFieldPostingsFormat.java:233)
at org.apache.lucene.index.SegmentCoreReaders.<init>(
SegmentCoreReaders.java:127)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:
at org.apache.lucene.index.ReadersAndLiveDocs.getReader(
ReadersAndLiveDocs.java:121)
at org.apache.lucene.index.ReadersAndLiveDocs.getReadOnlyClone(
ReadersAndLiveDocs.java:218)
at org.apache.lucene.index.StandardDirectoryReader.open(
StandardDirectoryReader.java:100)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java
:111)
at org.apache.lucene.search.SearcherManager.<init>(SearcherManager.
java:89)
at org.elasticsearch.index.engine.robin.RobinEngine.
buildSearchManager(RobinEngine.java:1457)
at org.elasticsearch.index.engine.robin.RobinEngine.start(
RobinEngine.java:278)
... 6 more
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:846)
... 28 more
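For reference, since the failure is thrown from sun.nio.ch.FileChannelImpl.map, the number of mappings the process currently holds can be counted from /proc/<pid>/maps (every mmapfs-backed Lucene file contributes one or more entries there). A small sketch, using the current shell's PID as a stand-in for the real Elasticsearch PID:

```shell
# Count active memory mappings for a process; each mmapped Lucene
# segment file shows up as one or more lines in /proc/<pid>/maps.
# $$ (the current shell) stands in for the Elasticsearch PID here.
PID=$$
wc -l < /proc/$PID/maps
```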
Additional information about the Linux machine on which Elasticsearch is
running:
Output of ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1031971
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1000000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1031971
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
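One thing I am not sure about: ulimit -v only caps the total virtual address space, while the number of distinct mmap regions a process may create is governed separately by the kernel's vm.max_map_count sysctl, which ulimit does not report. If that is the limit being hit here (an assumption on my part), it can be inspected, and raised, like this:

```shell
# Per-process cap on the number of memory mappings; the kernel default
# is 65530 on most distributions, regardless of ulimit -v being unlimited.
cat /proc/sys/vm/max_map_count

# To raise it (the value below is only an example), as root:
#   sysctl -w vm.max_map_count=262144
```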
$ cat /proc/version
Linux version 2.6.32-358.11.1.el6.x86_64 (mockbuild@c6b7.bsys.dev.centos.org
) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Jun 12
03:34:52 UTC 2013
$ cat /proc/meminfo
MemTotal: 132112280 kB
MemFree: 443680 kB
Buffers: 1396772 kB
Cached: 60330616 kB
SwapCached: 312 kB
Active: 43277292 kB
Inactive: 18489120 kB
Active(anon): 60172 kB
Inactive(anon): 99900 kB
Active(file): 43217120 kB
Inactive(file): 18389220 kB
Unevictable: 65066160 kB
Mlocked: 481892 kB
SwapTotal: 51511288 kB
SwapFree: 51509156 kB
Dirty: 464 kB
Writeback: 0 kB
AnonPages: 65105120 kB
Mapped: 121808 kB
Shmem: 3496 kB
Slab: 3874984 kB
SReclaimable: 3702696 kB
SUnreclaim: 172288 kB
KernelStack: 11808 kB
PageTables: 133852 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 117567428 kB
Committed_AS: 64937584 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 509920 kB
VmallocChunk: 34290676980 kB
HardwareCorrupted: 0 kB
AnonHugePages: 64514048 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 5056 kB
DirectMap2M: 2045952 kB
DirectMap1G: 132120576 kB
Even though I have set the virtual memory limit to unlimited, the
Elasticsearch Java process still cannot go beyond a certain virtual memory
allocation (1041 GB).
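As a back-of-the-envelope check (assuming, and this is only a guess, that the binding limit is the kernel's default vm.max_map_count of 65530 rather than ulimit -v), the observed ceiling would correspond to an average mapping of roughly 16 MB, which seems plausible for a cluster carrying many small segment files:

```shell
# If the default vm.max_map_count of 65530 were the binding limit,
# the observed 1041 GB ceiling would imply this average mapping size:
awk 'BEGIN { printf "%.1f MB\n", 1041 * 1024 / 65530 }'
# prints "16.3 MB"
```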
Has anyone faced a similar problem with the mmapfs setting? Can anybody
explain why I am getting these I/O exceptions in the Elasticsearch logs?
Thanks!
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.