ElasticSearch linear scaling problem

edeak · May 5, 2016, 9:50am

Hi guys,

I'm trying to figure out our basic scale unit based on this article: https://www.elastic.co/guide/en/elasticsearch/guide/current/scale.html

The hardware is fixed, and I use one primary shard with no replicas.
I measure the average response time with JMeter and my goal is to keep it around 500 ms for 50 parallel users.
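
For readers without JMeter, the measurement described above can be approximated with a small script. This is only a sketch: `send_request` is a hypothetical callable that issues one search request, and the default of 10 requests per user is an assumption, not something stated in the post.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(send_request):
    """Return the wall-clock duration of one request, in seconds."""
    start = time.perf_counter()
    send_request()
    return time.perf_counter() - start


def average_response_time(send_request, users=50, requests_per_user=10):
    """Run `users` parallel sessions and return (mean seconds, sample count)."""

    def user_session(_user_id):
        # Each simulated user sends its requests one after another.
        return [timed_call(send_request) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        sessions = list(pool.map(user_session, range(users)))

    samples = [t for session in sessions for t in session]
    return mean(samples), len(samples)
```

Passing a function that performs the actual HTTP search call gives a rough equivalent of the JMeter average. A mean hides outliers, so percentiles may be worth recording as well.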

The problem:

For 20,000 documents, the response time is 500 ms.
For 100,000 documents, the response time is around 1 sec.
For 7 million documents, the response time is 1.5 sec.

This is not linear scaling, and 20,000 documents seems like a very low number for my needs.
Could you please help me figure out what the problem might be? How can I determine my basic scale unit?
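
To make the figures above concrete, here is a small sketch that compares how much the document count grows between measurements with how much the response time grows. The numbers are the three measurements from this post. The helper name `growth_factors` is made up for illustration.

```python
# (document count, average response time in seconds), taken from the post
measurements = [(20_000, 0.5), (100_000, 1.0), (7_000_000, 1.5)]


def growth_factors(points):
    """For each consecutive pair, return (document growth, latency growth)."""
    factors = []
    for (docs_a, time_a), (docs_b, time_b) in zip(points, points[1:]):
        factors.append((docs_b / docs_a, time_b / time_a))
    return factors


for doc_growth, time_growth in growth_factors(measurements):
    print(f"documents x{doc_growth:g} -> response time x{time_growth:g}")
```

With these inputs, a 5x increase in documents doubles the response time, while a further 70x increase adds only another 1.5x. So the response time grows much more slowly than the document count.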

Thanks

ddorian43 · May 5, 2016, 9:53am

Note that it depends on the type of query. For example, if your query matches all documents, it will have to score them all, which will be slow since there are a lot of them.

So please post the mapping, the query, the hardware, the number of documents matched, and the query-profiling output.
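
A possible way to collect two of those items is sketched below. It assumes Elasticsearch 2.2 or later, where the search body accepts `"profile": true`. The thread does not state the version in use. The helper names are hypothetical, and `summarize` expects a search response that has already been parsed into a dict.

```python
def with_profile(search_body):
    """Return a copy of the search body with query profiling switched on."""
    profiled = dict(search_body)
    profiled["profile"] = True  # assumption: Elasticsearch 2.2+ Profile API
    return profiled


def summarize(response):
    """Extract the server-side time and the number of matched documents."""
    total = response["hits"]["total"]
    # Newer versions report the total as an object, e.g. {"value": 3, ...}.
    if isinstance(total, dict):
        total = total["value"]
    return {"took_ms": response["took"], "matched": total}
```

Comparing `matched` across the three index sizes would show whether the number of scored documents grows along with the index.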

edeak · May 5, 2016, 10:09am

Mapping:

5 hunspell-analyzed fields of varying length (some of them can be very long)

Hardware: 4 CPU, 16 GB RAM, 40 GB HDD

Query:
Queries are created dynamically by JMeter, with two variable parameters:

keywords (different number of keywords with different length)
date parameter

An example query:

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query": "KEYWORD1 KEYWORD2 KEYWORD3",
                "type": "most_fields",
                "fields": ["field1", "field2", "field3", "field4", "field5"]
              }
            }
          ]
        }
      },
      "filter": {
        "range": {
          "createDate": {
            "gte": "2010-11-12 00:00:01",
            "lte": "2010-11-13 23:59:59",
            "format": "yyyy-MM-dd HH:mm:ss"
          },
          "_cache": true
        }
      }
    }
  }
}
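
For anyone reproducing the test outside JMeter, the same query can be generated from the two variable parameters. This is only a sketch: `build_query` is a hypothetical helper, and the field names are the placeholders from the example above. It keeps the `filtered` query and `_cache` flag from this post, which newer Elasticsearch versions no longer accept.

```python
import json

FIELDS = ["field1", "field2", "field3", "field4", "field5"]  # placeholder names


def build_query(keywords, date_from, date_to):
    """Build the search body from a keyword list and a date range."""
    return {
        "query": {
            "filtered": {
                "query": {
                    "bool": {
                        "should": [
                            {
                                "multi_match": {
                                    "query": " ".join(keywords),
                                    "type": "most_fields",
                                    "fields": FIELDS,
                                }
                            }
                        ]
                    }
                },
                "filter": {
                    "range": {
                        "createDate": {
                            "gte": date_from,
                            "lte": date_to,
                            "format": "yyyy-MM-dd HH:mm:ss",
                        },
                        "_cache": True,
                    }
                },
            }
        }
    }


body = build_query(
    ["KEYWORD1", "KEYWORD2", "KEYWORD3"],
    "2010-11-12 00:00:01",
    "2010-11-13 23:59:59",
)
print(json.dumps(body, indent=2))
```

Building the body as a dict and serializing it with `json.dumps` avoids hand-written JSON mistakes such as a missing brace around a clause.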
