Hi,
We are designing an API with an Elasticsearch backend. The data is, in ES terms, very structured. The limit on nested fields has been raised to 75 (the default is 50).
The indices also use a lot of (ngram-like) analyzers.
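To illustrate the kind of index settings involved, here is a minimal sketch (the analyzer and tokenizer names are placeholders, not our actual mapping; only index.mapping.nested_fields.limit and the ngram tokenizer type are standard ES settings):

PUT specimen
{
  "settings": {
    "index.mapping.nested_fields.limit": 75,
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "ngram_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 3
        }
      }
    }
  }
}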
To give an idea, one index (called specimen) has 8 million documents (docs.count below is higher because nested objects are counted as separate Lucene documents). The Lucene size is:
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   specimen 2IASZLz9RamxGh8iqiIx_Q 12  0   117273767  645507       199.6gb    199.6gb
We are having some trouble with query performance when requesting large result sets (for example, a set of 500,000 documents).
What we find is that having all the shards of an index on one machine and then querying that same machine gives the best performance. For example, I used the routing allocation settings to place all shards of the specimen index on one server. If I then query the server that holds the specimen shards directly with
specimen/_search?pretty=false -d '{"size":"500000"}'
the result comes back in about 18 seconds. The result size is about 800MB (uncompressed JSON). If I query another server in the same cluster it takes about 81 seconds. If I spread the shards over multiple servers I also get around 80 to 90 seconds, and the same happens when setting the number of replicas to more than 0 (tried with 1, 2 and 4 replicas).
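For reference, pinning all shards of the index to a single node was done roughly like this, using the standard allocation-filtering setting (a sketch; the node name node-1 is a placeholder, not our actual node name):

PUT specimen/_settings
{
  "index.routing.allocation.require._name": "node-1"
}

After that setting is applied the cluster moves all specimen shards to that node, and the _search request above was sent either to that node or to another node in the cluster to compare timings.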
The question
So the question is: is this performance drop caused by the complex (deeply nested) mapping, or by retrieving such large document sets?
We are using ES 5.1.2.
The current cluster has 3 nodes, each with 32GB RAM (16GB heap), 400GB NVMe disks and 8 cores. Network connectivity between the nodes is about 8.5 Gbit/s (measured using iperf). (Also tested with 8 machines/200TB/32GB.)