Elasticsearch queries slow down as data size increases

We have a cluster with the following details:

  1. OS: Windows 7 (64 bit)
  2. Number of nodes: 2 (i7 processor, 8 GB RAM)
  3. ES version: 2.4.4

We have created an index with the following details:

  1. Index size: 86 GB
  2. Number of shards: 12
  3. Number of replicas: none
  4. Number of documents: 140 million
  5. Number of fields: 15
  6. For most of the fields we have set "index": "not_analyzed"
  7. For a few of the fields we have set "index": "no"
  8. We are not executing any full-text search, aggregations or sorting
  9. For 2 fields we are using fuzziness with an edit distance of 1

The 12 shards are evenly distributed across the 2 nodes (6 shards each). We run multi-search requests against this cluster, where each multi-search request consists of 6 individual queries.

Our queries are taking too long to execute. The "took" field shows that each individual query takes 3-8 seconds; only rarely do they complete in milliseconds.
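For anyone wanting to reproduce how the per-query timings were read, here is a minimal Python sketch that pulls the "took" value out of each entry of a multi-search response. The sample response is hypothetical and trimmed (the numbers are invented for illustration), and the helper names `took_per_query` and `slow_queries` are made up for this sketch.

```python
import json

# Hypothetical, trimmed _msearch response; values are invented for illustration.
SAMPLE_RESPONSE = json.loads("""
{"responses": [
  {"took": 3120, "timed_out": false, "hits": {"total": 812, "hits": []}},
  {"took": 7890, "timed_out": false, "hits": {"total": 10, "hits": []}}
]}
""")


def took_per_query(msearch_response):
    """Return the 'took' value (ms) of each individual search, in request order.

    Entries without a 'took' key (for example failed searches) yield None.
    """
    return [r.get("took") for r in msearch_response.get("responses", [])]


def slow_queries(msearch_response, threshold_ms=1000):
    """Return (position, took_ms) pairs for searches at or above the threshold."""
    return [
        (i, took)
        for i, took in enumerate(took_per_query(msearch_response))
        if took is not None and took >= threshold_ms
    ]


print(took_per_query(SAMPLE_RESPONSE))
print(slow_queries(SAMPLE_RESPONSE, threshold_ms=5000))
```

Logging the position alongside the timing makes it easier to see whether one particular query shape in the batch of 6 is consistently the slow one.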

The average number of records returned per result set is around 800 (minimum 10, maximum 10k).

When we ran the same test on a relatively small data set (10 million records, 7 GB in size), each individual query took 50-200 milliseconds.
Could someone suggest what might be causing our queries to slow down as the index size increases?
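For reference, the figures above can be broken down per shard and per node with a few lines of Python. This is only arithmetic on the numbers quoted in this post. It assumes the small test ran on the same 2-node, 12-shard layout, which is not stated above, and the function name `layout` is made up for this sketch.

```python
# Figures quoted in the post. The node and shard counts of the small test
# are ASSUMED to match the large index; the post does not say so.
NODES = 2
SHARDS = 12
RAM_PER_NODE_GB = 8


def layout(index_gb, docs):
    """Return size per shard, size per node and documents per shard."""
    return {
        "gb_per_shard": index_gb / SHARDS,
        "gb_per_node": index_gb / NODES,
        "docs_per_shard": docs / SHARDS,
    }


large = layout(86, 140_000_000)  # the slow index
small = layout(7, 10_000_000)    # the fast test index

for name, stats in (("large", large), ("small", small)):
    rounded = {key: round(value, 2) for key, value in stats.items()}
    print(name, rounded, "RAM per node (GB):", RAM_PER_NODE_GB)
```

Putting the per-node data size next to the RAM per node gives a quick sense of how the two test setups differ in scale.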

Why so many shards?

What do these queries look like?

Thank you for your response, @warkolm.

We chose 12 shards so that we can add more nodes later (without needing to reindex) as the data grows.

The following are 2 sample queries. A total of 6 similar queries are sent in each multi-search request.

POST /_msearch 
{"index" : "school"} 
{ "query": { "bool" : { "must" : [ { "bool" : { "should" : { "range" : { "marks" : { "from" : "100000000", "to" : "200000000", "include_lower" : true, "include_upper" : true } } } } }, { "nested" : { "query" : { "match" : { "query" : "25 ", "fields" : [ "subject.chapter" ] } }, "path" : "subject" } } ] } } }
{"index" : "school"}
{ "query": { "bool" : { "must" : { "nested" : { "query" : { "match" : { "query" : "A100123", "fields" : [ "student.id" ], "fuzziness" : "1" } }, "path" : "student" } } } } }
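The multi-search body is newline-delimited JSON: one header line and one query line per search, with a trailing newline at the end. A minimal Python sketch of assembling such a body follows. The helper name `build_msearch_body` is made up for this sketch, and the query shown is a simplified stand-in based on the range clause above, not the full query.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, query) pairs into an _msearch request body.

    Each header and each query is written as a single JSON line, and the
    body ends with a newline, as the multi-search format expects.
    """
    lines = []
    for header, query in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(query))
    return "\n".join(lines) + "\n"


# Simplified stand-in for one of the 6 queries (range clause only).
range_query = {
    "query": {
        "range": {
            "marks": {
                "from": "100000000",
                "to": "200000000",
                "include_lower": True,
                "include_upper": True,
            }
        }
    }
}

searches = [({"index": "school"}, range_query)]
body = build_msearch_body(searches)
print(body, end="")
```

Building the body from Python dicts avoids accidentally pretty-printing a query across several lines, which would break the line-based format.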

Hope this helps.
