I am benchmarking search speed with different parameters.
One parameter I am testing is the use of copy_to.
Scenario 1.
Number of Fields : 10000
Documents indexed : 500
Indices : 10
Search on Indices : 10 at once
Search Speed using copy_to field: 20 ms - 3 s
Scenario 2.
Number of Fields : 10000
Documents indexed : 100000
Indices : 10
Search on Indices : 10 at once
Search Speed using copy_to field: 6 s - 18 s
I assumed that the number of documents stored in Elasticsearch would not affect search speed.
Also, since I am using a copy_to field, the search only has to look at one field, which should reduce the number of disk seeks. So this should not be something to worry about either.
Why does the search time increase so drastically when the number of documents increases?
Am I missing something here? Because this is driving me crazy!
Thanks, Christian, for the quick response.
As I mentioned earlier, this is a benchmarking process.
My query looks something like this:
{
  "query": {
    "bool": {
      "should": [
        1. query_string query for an exact match
        2. multi_match of type phrase with a slop of 1
        3. multi_match with fuzziness
        4. multi_match without fuzziness
      ],
      "minimum_should_match": 1
    }
  }
}
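For reference, the outline above could be written out roughly as follows. This is only a sketch: the field name all_text and the search text are hypothetical placeholders, and the exact parameters used in the benchmark are not shown in this thread. Every clause is pointed at a single field so that nothing falls back to searching all fields.

```python
def build_query(text, field="all_text"):
    """Build a bool query with four should clauses, all restricted to one field.

    `field` is a hypothetical catch-all field name, not one from the thread.
    """
    # Wrap the text in double quotes so query_string treats it as an exact phrase.
    quoted = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return {
        "query": {
            "bool": {
                "should": [
                    # 1. query_string query for an exact match
                    {"query_string": {"query": quoted, "default_field": field}},
                    # 2. multi_match of type phrase with a slop of 1
                    {"multi_match": {"query": text, "fields": [field],
                                     "type": "phrase", "slop": 1}},
                    # 3. multi_match with fuzziness
                    {"multi_match": {"query": text, "fields": [field],
                                     "fuzziness": "AUTO"}},
                    # 4. multi_match without fuzziness
                    {"multi_match": {"query": text, "fields": [field]}},
                ],
                "minimum_should_match": 1,
            }
        }
    }
```

The returned dict can be serialized with json.dumps and sent as the body of a _search request.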
If you are using query string queries without specifying a field name, be aware that the old _all field has been replaced by an all_fields mode, which can be slow if it needs to iterate over a lot of fields. Combine this with slop and fuzziness and you could have a very expensive (and slow) query. You may also want to look into creating a custom copy_all field to use by default, much like the old _all field worked.
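As a rough illustration of that suggestion, the sketch below builds an index body in which every source field copies into one catch-all field, and that field is set as the default for queries that do not name a field. The field names are hypothetical, the mapping uses the typeless form from Elasticsearch 7 and later (older versions nest properties under a type name), and index.query.default_field is the index setting assumed here for the default.

```python
def build_index_body(source_fields, catch_all="copy_all"):
    """Build index settings and mappings with a custom catch-all field.

    `source_fields` and `catch_all` are illustrative names only.
    """
    # Each source field is indexed normally and also copied into the catch-all.
    properties = {
        name: {"type": "text", "copy_to": catch_all} for name in source_fields
    }
    # The catch-all field itself is a plain text field.
    properties[catch_all] = {"type": "text"}
    return {
        "settings": {
            # Queries that do not name a field search the catch-all
            # instead of expanding to every field in the mapping.
            "index": {"query": {"default_field": catch_all}}
        },
        "mappings": {"properties": properties},
    }
```

With a body like this, a query string query without a field name searches one field rather than iterating over thousands.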