Hardware is fix, and I use one primary shard without any replica.
I measure the average response time with JMeter and my goal is to keep it around 500 ms for 50 parallel users.
The problem:
for 20.000 documents response time is 500 ms
for 100.000 documents response time is around 1 sec
for 7 million documents response time is 1.5 sec
This is not a linear scaling and 20.000 documents seems a very low amount for my needs.
Could you please help me, what can be the problem? How can I figure out my basic scale unit?
Note, it depends on the type of query. Example: if your query is matching all documents, it will have to SCORE them all, which will be slow since there are alot of them.
So add the mapping, query, hardware, number of documents matched, query-profiling output.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.