We are currently working on an application that uses Elasticsearch, and we are trying to evaluate the benefits and constraints of the different architecture choices: spreading data across multiple indices or using a single shared one.
If you use an index with concurrent reads and writes, are there any penalties, especially for searches, if you write too much?
IMHO the question is more about shards than indices, unless you have time-based data, in which case both questions (number of indices and number of shards) apply.
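To make the "shards rather than indices" point concrete, here is a minimal sketch of the arithmetic. The index names and the shard/replica numbers are made up for illustration; the only assumption is that each index contributes its primaries plus one copy of every primary per replica.

```python
def total_shards(indices):
    """Return the total number of shard copies for a set of indices.

    `indices` maps an index name to (primary_shards, replicas_per_primary).
    Each primary has `replicas` extra copies, so an index contributes
    primaries * (1 + replicas) shards to the cluster.
    """
    return sum(p * (1 + r) for p, r in indices.values())


# Hypothetical layouts: 20 small indices versus one shared index,
# each configured with 5 primaries and 1 replica.
many_small = {f"index-{i}": (5, 1) for i in range(20)}
merged = {"shared": (5, 1)}

print(total_shards(many_small))  # 200
print(total_shards(merged))      # 10
```

The number of indices only matters through the shards it creates, which is why the shard settings are worth looking at first.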
I don't think there is a pre-made answer; I believe you have to test your own scenario. A tool like Rally could help with that.
If your cluster is well designed, you should not really see a lot of penalties with a lot of writes.
I've heard about users indexing up to 10m documents per second. Of course, that's not with only one shard on one node using HDD drives...
Could you maybe describe your use case a bit more? A typical document?
Are you already seeing some slowness?
Thanks for your answer. To summarize, in the past our application used many indices. Some of them receive less frequent writes than others, which are dedicated to activities like auditing. When we started implementing the application we used too many indices, and the application consumed too much memory because too many shards were loaded. We have already done a first pass to merge some indices, but so far we have only merged indices that are mostly used for reads/searches. I was wondering what the impact could be if we continue merging in this way and mix in data that is frequently inserted/updated.
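For reference, one way such a merge can be expressed is with the `_reindex` API, which accepts a list of source indices. This is only a sketch and not necessarily how the merge described above was done; the index names are hypothetical, and the snippet just builds the request body that would be sent as `POST _reindex`.

```python
import json


def reindex_body(source_indices, dest_index):
    """Build a _reindex request body that copies several indices into one."""
    return {
        "source": {"index": list(source_indices)},
        "dest": {"index": dest_index},
    }


# Hypothetical index names, for illustration only.
body = reindex_body(["users", "groups"], "shared")
print(json.dumps(body, indent=2))
```

If documents in different source indices can share the same `_id`, they would likely overwrite each other in the destination, so that is worth checking before merging.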
I am a beginner with ES, so I am not yet comfortable with the way ES manages resources. In any case, the size of the data pushed is not big and the write frequency is not really greater than thousands per second, but the global index can contain millions of documents.
Which resources are mainly consumed during write operations? CPU? Memory? What do you mean by "cluster is well designed"? Are there any links or recommendations we could consult?