For the dev/debug/test cycle, I'm using an EC2 instance with 7GB memory and HEAP_SIZE=4GB. ulimit shows 350,000. 1 replica and 5 shards.
I have two application codebases - one is our current production version, where child documents are held in an array field (nested type not really needed). This production codebase has been running flawlessly for the past 4 months. I am investigating replacing the array field with parent/child because the array is proving to be untenable.
The data set used for this investigation is the same - 2 indices: indexA with 280 parents and 10,000 children, indexB with 300 parents and 5,000 children.
The production codebase rebuilt the 2 indices without any issues as expected.
For the most part, the new parent/child codebase was ready and testing started this week. For manual debugging, I started building small subsets of indexA and indexB - e.g. 5 parents/10 children each - and those built just fine. However, I was unable to build either test index fully - see the error below. Our production corpus has millions of parents and children.
From reading the parent/child threads here, I understand the need to load both parent and child ids into memory, but how are parent/child documents mapped to file descriptors? Are they at all?
Has anyone else seen a similar issue? Does anyone have a deployment with millions of parents/children?
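For context, this is the kind of parent/child setup I'm testing - a minimal sketch, with type names made up for illustration (my real mappings differ):

```json
{
  "child_type": {
    "_parent": {
      "type": "parent_type"
    }
  }
}
```

Children are then indexed with the parent id passed on the index request (e.g. `?parent=<parent_id>`), so they get routed to the parent's shard.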
Thanks.
-- error msg ---
org.elasticsearch.index.engine.IndexFailedEngineException: [test_conversations_3][2] Index failed for [conversation#94]
at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:499)
at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:320)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:158)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: /opt/lfops/ops/elasticsearch/parts/elasticsearch/data/livefyre/nodes/0/indices/test_conversations_3/2/index/_1h.tvx (Too many open files)
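In case it helps anyone reproduce this, here is roughly how I've been checking the effective limit and current fd usage of the ES process on Linux (the pgrep pattern is an assumption - adjust it to however your node's process is named):

```shell
# find the elasticsearch java process (name pattern is an assumption)
ES_PID=$(pgrep -f elasticsearch | head -n1)

# effective open-file limit actually applied to that process
grep 'Max open files' /proc/$ES_PID/limits

# number of file descriptors the process currently has open
ls /proc/$ES_PID/fd | wc -l
```

Comparing the second number against the first while the bulk indexing runs shows whether the node is genuinely approaching the limit or whether the limit set via ulimit isn't being inherited by the service at all.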
In another thread, Shay mentioned setting es.max-open-files=true to see the actual maximum. After wandering around trying to find the right file to set this in, I finally figured out it's bin/service/elasticsearch.conf, and I set it like so:
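(Roughly like the following, in Java Service Wrapper syntax - the property index is just the next free wrapper.java.additional slot in my elasticsearch.conf, so yours may differ:)

```
wrapper.java.additional.10=-Des.max-open-files=true
```

With that set, the node logs the maximum number of open files it can actually use at startup.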