Hi,
In the course "Elasticsearch Engineer II", it is said:
"A search request is distributed to the shards and on each
shard it is performed sequentially over the segments"
Does that mean that if two requests arrive on a shard, the second one waits until the first one has finished searching all segments?
No, this statement is not about concurrent searches, but about how a single search is handled. Two searches coming in at the same time will also be executed in parallel.
Still, a single search over a Lucene index (which is essentially what a shard is) will go through each of the segments sequentially.
Hope this clears things up. If not, let's go over it again.
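To make the distinction concrete, here is a toy model in Python. It is only a sketch and not how Lucene is implemented: the `shard` data, the `search_shard` function and the term-count scoring are all made up for illustration. It shows one search visiting the segments one after another, while two separate searches run side by side on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical toy data: a shard is a list of segments,
# and each segment is a list of (doc_id, text) pairs.
shard = [
    [(1, "quick brown fox"), (2, "lazy dog")],
    [(3, "quick red fox")],
    [(4, "brown bear"), (5, "quick quick fox")],
]

def search_shard(segments, term, size=10):
    """Run ONE search: visit the segments one after another."""
    hits = []
    for segment in segments:  # sequential within a single search
        for doc_id, text in segment:
            score = text.split().count(term)  # naive score: term frequency
            if score:
                hits.append((score, doc_id))
    # Keep the best hits across all segments, highest score first.
    return heapq.nlargest(size, hits)

# Two independent searches are submitted together; neither waits
# for the other to finish walking the segments.
with ThreadPoolExecutor(max_workers=2) as pool:
    fox_future = pool.submit(search_shard, shard, "fox")
    brown_future = pool.submit(search_shard, shard, "brown")
    fox_hits = fox_future.result()
    brown_hits = brown_future.result()

print(fox_hits)
print(brown_hits)
```

In this model, the cost of a single search grows with the number of segments it has to walk, which is the point behind the follow-up question.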
A new question in this case: does the number of segments impact the request processing duration? If there are fewer segments, does this duration decrease? If yes, is it good practice to call a force merge after mass insertions?
Fewer segments usually means a faster search, which is why merging also happens in the background. Force merging mostly makes sense if you have data that gets written once and is never written to again (no deletes, no updates, no newly indexed documents).
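As a rough illustration, a force merge can be requested through the `_forcemerge` endpoint with the `max_num_segments` parameter. The sketch below only builds the HTTP request with the Python standard library. The cluster address `http://localhost:9200`, the index name `logs-2019.01` and the helper `build_forcemerge_request` are assumptions for this example, and a secured cluster would also need credentials.

```python
import urllib.request

def build_forcemerge_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST request for the _forcemerge endpoint."""
    url = (
        f"{base_url.rstrip('/')}/{index}/_forcemerge"
        f"?max_num_segments={max_num_segments}"
    )
    return urllib.request.Request(url, method="POST")

# Hypothetical cluster address and index name, for illustration only.
req = build_forcemerge_request("http://localhost:9200", "logs-2019.01")
print(req.get_method(), req.full_url)

# To actually send it against a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

This is best run once the index is no longer being written to, for example after a bulk load has finished or a time-based index has been rolled over.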