Problem: The ES instance requires a lot of memory for quick bulk updates. Otherwise it's underutilized.
Can I create two ES instances with the same shared data directory, say one with 4 GB (searching) and the other with 32 GB (indexing), and only run one of them at a time?
I could conditionally turn on the larger one during indexing and replace it with the 4 GB one once it's done.
What you describe is effectively running a single node with two different configs, so you are really asking whether you can change the config of a node. Yes, you can do that.
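To make "one node, two configs" concrete, here is a minimal sketch, assuming the node runs in Docker from the official image. It only builds the two `docker run` command lines and does not start anything. The volume name `esdata`, the container name `es-node`, the image tag, and the heap sizes (half of the container memory) are all assumptions for illustration, not values from this thread.

```python
import shlex

# Assumptions for illustration: image tag, volume name and container name are placeholders.
IMAGE = "docker.elastic.co/elasticsearch/elasticsearch:8.13.0"
DATA_VOLUME = "esdata"
CONTAINER_NAME = "es-node"

# Two memory profiles for the same node. Heap is set to half of the container
# memory here, which is a common rule of thumb, not a requirement.
PROFILES = {
    "search": {"container_memory": "4g", "heap": "2g"},
    "index": {"container_memory": "32g", "heap": "16g"},
}


def docker_run_command(profile: str) -> list[str]:
    """Build the docker run arguments for one profile of the single node."""
    p = PROFILES[profile]
    return [
        "docker", "run", "--rm",
        # Same container name for both profiles, so Docker refuses to start
        # the second one while the first is still running.
        "--name", CONTAINER_NAME,
        "--memory", p["container_memory"],
        "-e", "discovery.type=single-node",
        "-e", f"ES_JAVA_OPTS=-Xms{p['heap']} -Xmx{p['heap']}",
        # Same data volume for both profiles: this is what makes it one node.
        "-v", f"{DATA_VOLUME}:/usr/share/elasticsearch/data",
        IMAGE,
    ]


if __name__ == "__main__":
    for name in PROFILES:
        print(f"# {name} profile")
        print(shlex.join(docker_run_command(name)))
```

Switching profiles still means stopping the node and starting it again with the other command, so the data is unavailable during the switch.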
Not really, no. If you only have one data directory then you only have one node, since a node is defined by the contents of its data directory. Everything else is config that is under your control.
Is it possible for two Elasticsearch processes (running in different containers) to share the same data directory, assuming only one of them runs at a time?
I will experiment again with other techniques and approaches to try to increase my indexing speed. But if nothing else works, do you think the approach I mentioned above (shared data dir) would be a good solution for my quick-indexing problem? Do you see any immediate issues that could arise with this approach?
I haven't seen this way mentioned in any of the blogs or tutorials.
IMO the main drawback is just the downtime: it's not normally acceptable to restart things in between the indexing and searching phases, because people mostly want to search the data as it arrives. Furthermore if you want to do some more indexing then you must restart the node again, breaking any ongoing searches.
It's more usual to have a hot/warm setup, often with ILM to move indices from the hot tier to the warm tier and do various other optimisations like force-merging them at the same time. That way you can scale the hot (indexing) and warm (search) tiers independently and can scale the hot tier down (all the way to zero if needed) when there's no indexing taking place, without affecting the warm tier.
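As a rough illustration of the hot/warm idea, the sketch below builds an ILM policy body that rolls an index over in the hot phase and then, in the warm phase, relocates it and force-merges it. It only prints the request, it does not talk to a cluster. The policy name `hot-warm-sketch`, the `max_age` value, and the custom node attribute `data: warm` (which would have to be set on the warm nodes via `node.attr.data`) are assumptions for illustration.

```python
import json


def hot_warm_policy(rollover_max_age: str = "1d", warm_after: str = "0d") -> dict:
    """Return an ILM policy body with a hot phase and a warm phase."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Start a new index once the current one is old enough.
                        "rollover": {"max_age": rollover_max_age}
                    }
                },
                "warm": {
                    # Time to wait after rollover before entering this phase.
                    "min_age": warm_after,
                    "actions": {
                        # Assumes warm nodes are started with node.attr.data: warm.
                        "allocate": {"require": {"data": "warm"}},
                        # Merge down to one segment for cheaper searches.
                        "forcemerge": {"max_num_segments": 1},
                    },
                },
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical policy name; the body would be sent with PUT _ilm/policy/<name>.
    print("PUT _ilm/policy/hot-warm-sketch")
    print(json.dumps(hot_warm_policy(), indent=2))
```

With a policy along these lines, the hot nodes only need to be sized for indexing and can be scaled down when nothing is being written, while searches keep running against the warm nodes.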