ES 6.4.1: Almost empty cluster doesn't perform as expected


(Marten) #1

Hi,

I've set up a cluster consisting of 3 nodes.
Each node has 128 GB RAM, of which 30 GB is allocated to ES, so 90 GB in total.
Disks are SSD.
I have an index of about 5 million documents, with 2 primary shards and 2 replicas.
The cluster is running with approx 17% of the JVM Memory being used.
CPU usage is approx 1%.

When I run a simple "match_all" query on the index with "size":0, I get response times of approx. 40-50 ms (based on the "took" value in the response).
Considering the total capacity of my cluster, I expected much lower response times (less than 10 ms).

If I use the "profile" API, the response times even go up to approx. 250 ms.

Can anybody help me get better performance?

Thanks,

Marten


(Christian Dahlqvist) #2

What does your match all query look like? What does your data look like? How are you measuring the response time?


(Marten) #3

Hi Christian,

Thanks for your quick response.
This is my query:

GET indexname/_search
{
"size": 0,
"query": {
"match_all": {}
}
}

The index contains structured data from a database, not regular documents such as ingested PDFs or Word files.
It has approx. 150 fields, 2 of which are simple nested fields.
The response time I measure is the "took" value from the query response.
I'm using Kibana for testing the queries.
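
A side note on the measurement: "took" covers only the time Elasticsearch spends executing the search, not the HTTP round trip or client-side parsing, so it can help to record both. Below is a minimal sketch, assuming a hypothetical `run_search` callable that sends the body to the cluster and returns the parsed JSON response. A stub stands in for the real client here, so the numbers it prints are not measurements.

```python
import time
from typing import Callable, Tuple


def timed_search(run_search: Callable[[dict], dict], body: dict) -> Tuple[float, float]:
    """Return (took_ms, wall_ms) for a single search request.

    run_search is a hypothetical callable: it sends `body` to the cluster
    and returns the parsed JSON response as a dict.
    """
    start = time.perf_counter()
    response = run_search(body)
    wall_ms = (time.perf_counter() - start) * 1000.0
    return float(response["took"]), wall_ms


def fake_search(body: dict) -> dict:
    # Stand-in for a real client call; sleeps briefly to simulate latency.
    time.sleep(0.005)
    return {"took": 3, "hits": {"total": 0, "hits": []}}


took_ms, wall_ms = timed_search(fake_search, {"size": 0, "query": {"match_all": {}}})
print(f"took={took_ms:.1f} ms, wall={wall_ms:.1f} ms")
```

A large gap between the two values would point at the network or the client rather than at the search itself.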

Best regards,

Marten


(Marten) #4

Hi there,

Is there anybody who has an idea how I can improve my performance?
I'm replacing a 7-year-old Google Search Appliance with 3 dedicated Elasticsearch servers, but my response times have doubled compared to the GSA.

Any idea where I have to look?

Thanks,


(Marten) #5

@Christian_Dahlqvist

Hi Christian,

Do you need more information or do you have an idea?
I've built a simple script to measure average performance over 1000 queries.

GET indexname/_search
{
"size": 0,
"query": {
"match_all": {}
}
}

gives an average response time of 0.5 ms

GET indexname/_search
{
"size": 1,
"query": {
"match_all": {}
}
}

gives an average response time of 30 ms

GET indexname/_search
{
"size": 10,
"query": {
"match_all": {}
}
}

gives an average response time of 30.5 ms
Querying with size 100 gives an average response time of 35 ms.
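
For reference, a comparison like the one above can be sketched as follows. This is not the original script, only a minimal illustration under stated assumptions: `run_search` is a hypothetical callable that sends the body and returns the parsed response, and the stub used here simply echoes the requested size as "took" so the example runs without a cluster. One thing worth keeping in mind when reading the size 0 figure: requests with "size": 0 are eligible for the shard request cache by default, so repeated identical queries may be answered from cache.

```python
from statistics import mean
from typing import Callable, Dict, Iterable


def average_took(
    run_search: Callable[[dict], dict],
    sizes: Iterable[int],
    repeats: int = 1000,
) -> Dict[int, float]:
    """Average the reported `took` (ms) of a match_all query per `size`."""
    results = {}
    for size in sizes:
        body = {"size": size, "query": {"match_all": {}}}
        results[size] = mean(run_search(body)["took"] for _ in range(repeats))
    return results


def fake_search(body: dict) -> dict:
    # Synthetic stand-in: echoes the requested size as `took`.
    # Replace with a real client call to obtain actual timings.
    return {"took": body["size"]}


averages = average_took(fake_search, sizes=[0, 1, 10, 100], repeats=5)
for size, avg in averages.items():
    print(f"size={size}: avg took {avg} ms")
```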

I have created a new index without nested fields, removed the synonyms.txt filter, force merged to 1 segment etc., all with no effect on the performance.
I'm the only one using this cluster, no additional software or services are running on this cluster.

It seems that fetching the results, even for just 1 hit, kills the search performance.
I can't find anything on this in the documentation.
Do you have any idea where I can start to look?
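
One way to check whether loading the stored document is what dominates the fetch phase is to run the same query with and without `_source`. The sketch below only builds the request bodies; the variant names and the choice of `field1` are illustrative assumptions, and whether this changes the timings on this cluster is untested.

```python
import copy
from typing import Dict


def fetch_variants(base_body: dict) -> Dict[str, dict]:
    """Build copies of a search body that differ only in `_source` handling."""
    full_source = copy.deepcopy(base_body)

    no_source = copy.deepcopy(base_body)
    no_source["_source"] = False  # return hit metadata only, no document body

    one_field = copy.deepcopy(base_body)
    one_field["_source"] = ["field1"]  # hypothetical field choice

    return {
        "full_source": full_source,
        "no_source": no_source,
        "one_field": one_field,
    }


base = {"size": 1, "query": {"match_all": {}}}
variants = fetch_variants(base)
for name, body in variants.items():
    print(name, body)
```

If the `no_source` variant is much faster than `full_source`, the cost is likely in reading and serializing the roughly 150-field documents rather than in the query itself.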


(Christian Dahlqvist) #6

What type of real queries do you have and what is the latency requirement for each type?


(Marten) #7

Hi @Christian_Dahlqvist,

We're firing 2 queries using _msearch
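
For readers unfamiliar with `_msearch`: the request body is newline-delimited JSON, with one header line and one body line per search, and it must end with a newline. A minimal sketch of assembling such a body is shown below; the two queries are abbreviated versions of the ones in this post, and `build_msearch_body` is a hypothetical helper, not part of any client library.

```python
import json
from typing import Iterable, Tuple


def build_msearch_body(searches: Iterable[Tuple[dict, dict]]) -> str:
    """Serialize (header, body) pairs into the NDJSON format used by _msearch."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))  # e.g. {"index": "index5"}
        lines.append(json.dumps(body))    # the search body, on a single line
    return "\n".join(lines) + "\n"        # trailing newline is required


# Abbreviated versions of the two queries from this post.
lookup_query = {"query": {"bool": {"filter": [{"term": {"key": "searchterm"}}]}}}
main_query = {
    "from": 0,
    "size": 10,
    "query": {
        "multi_match": {
            "query": "searchterm",
            "fields": ["field1", "field17^2.0"],
            "type": "cross_fields",
            "operator": "AND",
        }
    },
}

payload = build_msearch_body([
    ({"index": "index5"}, lookup_query),
    ({"index": ["index1", "index2", "index3"]}, main_query),
])
print(payload)
```

The payload would be sent to `_msearch` with the `Content-Type: application/x-ndjson` header.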

This is our 1st query, index = 500 records:

> GET index5/_search
> {
>   "query": {
>     "bool": {
>       "filter": [
>         {"term": {"key": {"value": "searchterm", "boost": 1.0}}}
>       ],
>       "adjust_pure_negative": true,
>       "boost": 1.0
>     }
>   }
> }

This is our 2nd query, index = 5 mil records:

> GET index1,index2,index3/_search
> {
>   "from": 0,
>   "size": 10,
>   "query": {
>     "bool": {
>       "must": [
>         {
>           "multi_match": {
>             "query": "searchterm",
>             "fields": [
>               "field1^1.0",
>               "field2^1.0",
>               "field3^1.0",
>               "field4^1.0",
>               "field5^1.0",
>               "field6^1.0",
>               "field7^1.0",
>               "field8^1.0",
>               "field9^1.0",
>               "field10^1.0",
>               "field11^1.0",
>               "field12^1.0",
>               "field13^1.0",
>               "field14^1.0",
>               "field15^1.0",
>               "field16^1.0",
>               "field17^2.0",
>               "field18^1.0",
>               "field19^1.0",
>               "field20^1.0",
>               "field21^1.0",
>               "field22^1.0",
>               "field23^1.0",
>               "field24^1.0",
>               "field25^1.0",
>               "field26^1.0",
>               "field27^1.0",
>               "field28^1.0",
>               "field29^1.0",
>               "field30^1.0",
>               "field31^1.0",
>               "field32^1.0",
>               "field33^1.0",
>               "field34^1.0",
>               "field35^1.0",
>               "field36^1.0"
>             ],
>             "type": "cross_fields",
>             "operator": "AND",
>             "slop": 0,
>             "prefix_length": 0,
>             "max_expansions": 50,
>             "zero_terms_query": "NONE",
>             "auto_generate_synonyms_phrase_query": true,
>             "fuzzy_transpositions": true,
>             "boost": 1.0
>           }
>         }
>       ],
>       "filter": [
>         {
>           "term": {
>             "field40": {
>               "value": "value1",
>               "boost": 1.0
>             }
>           }
>         }
>       ],
>       "should": [
>         {
>           "term": {
>             "field41": {
>               "value": "true",
>               "boost": 2.0
>             }
>           }
>         },
>         {
>           "term": {
>             "field42": {
>               "value": "true",
>               "boost": 2.0
>             }
>           }
>         },
>         {
>           "term": {
>             "field43": {
>               "value": "true",
>               "boost": 2.0
>             }
>           }
>         },
>         {
>           "term": {
>             "field44": {
>               "value": "true",
>               "boost": 3.0
>             }
>           }
>         }
>       ],
>       "adjust_pure_negative": true,
>       "boost": 1.0
>     }
>   },
>   "indices_boost": [
>     {
>       "index1": 3.0
>     }
>   ],
>   "highlight": {
>     "pre_tags": [
>       "\u003cb\u003e"
>     ],
>     "post_tags": [
>       "\u003c/b\u003e"
>     ],
>     "fragment_size": 0,
>     "fields": {
>       "field1": {},
>       "field2": {},
>       "field3": {},
>       "field4": {},
>       "field5": {},
>       "field6": {},
>       "field7": {},
>       "field8": {},
>       "field9": {},
>       "field10": {},
>       "field11": {},
>       "field12": {},
>       "field13": {},
>       "field14": {},
>       "field15": {},
>       "field16": {},
>       "field17": {},
>       "field18": {},
>       "field19": {},
>       "field20": {},
>       "field21": {},
>       "field22": {},
>       "field23": {},
>       "field24": {},
>       "field25": {},
>       "field26": {},
>       "field27": {},
>       "field28": {},
>       "field29": {},
>       "field30": {},
>       "field31": {},
>       "field32": {},
>       "field33": {},
>       "field34": {},
>       "field35": {}
>     }
>   }
> } 

90% of our queries are like the one above.
We need an average response time of less than 10 ms (the same as the search solution we are replacing).

Marten


(Marten) #8

Does anybody have an idea why I get bad response times?

Response times are reasonable up to 40 queries per second, but at 50 queries per second I get response times of up to 3500 ms.
Would it help to combine some fields into 1 field and thus reduce the number of queried fields in the multi_match query?
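
The field-combining idea is usually done with the `copy_to` mapping parameter, which copies values from several fields into one searchable field at index time. The sketch below builds such a mapping and a matching single-field query. The assumptions are loud ones: the target name `all_text` is made up, all source fields are assumed to be `text`, the `_doc` mapping type is assumed for a 6.x index, the data would have to be reindexed, and per-field boosts such as `field17^2.0` would no longer apply. Whether this fixes the latency seen here is untested.

```python
from typing import Iterable


def combined_field_mapping(
    source_fields: Iterable[str],
    target: str = "all_text",   # hypothetical name for the combined field
    doc_type: str = "_doc",     # mapping type, assumed for a 6.x index
) -> dict:
    """Build an index body where every source field is copied into `target`."""
    properties = {
        name: {"type": "text", "copy_to": target} for name in source_fields
    }
    properties[target] = {"type": "text"}
    return {"mappings": {doc_type: {"properties": properties}}}


def combined_query(term: str, target: str = "all_text") -> dict:
    """Query the single combined field instead of listing every field."""
    return {"query": {"match": {target: {"query": term, "operator": "and"}}}}


fields = [f"field{i}" for i in range(1, 37)]  # field1 .. field36, as in the query
mapping = combined_field_mapping(fields)
query = combined_query("searchterm")
print(len(mapping["mappings"]["_doc"]["properties"]), "mapped fields")
print(query)
```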

I have already increased the number of shards to 7 with 2 replicas, but this didn't do very much for the performance.
I have removed all nested fields also.

All nodes are running with <50% heap used and 0% CPU used.

I've run out of ideas.

