The queries for numeric fields are slower after upgrading the cluster from 2.4.5 to 5.6.3

Zzzxf · February 27, 2018, 9:35am

Cluster info

PS: two clusters with the same nodes and docs
Elasticsearch version : 2.4.5/5.6.3
JVM version : java8
OS version: Linux CentOS 6
Nodes: 32(data/master)

Problem

The queries for numeric fields are slower after upgrading the cluster from 2.4.5 to 5.6.3. The average and tp99 response times of the 5.x cluster are almost twice those of the 2.x cluster.

Query
The field xxx_id is numeric. The query contains 500 to 1000 random xxx_id values.
{ "from": 0, "size": 1000, "timeout": "5000ms", "query": { "query_string": { "query": "xxx_id:(3976321 2681125 3395902 565629 1473422... )" } } }
Result of 5.x
- search result:
  { "took": 1417, "timed_out": false, "_shards": { "total": 4, "successful": 4, "skipped": 0, "failed": 0 }, "hits": { "total": 75350, "max_score": 0, "hits": [...] } }
- profile result:
  { "took": 1417, "timed_out": false, "_shards": { "total": 4, "successful": 4, "skipped": 0, "failed": 0 }, "hits": {...}, "profile": { "shards": [ { "id": "[xxxx][indexxxxx][2]", "searches": [ { "query": [ { "type": "BooleanQuery", "description": "xxx_id:[3976321 TO 3976321] xxx_id:[2681125 TO 2681125] xxx_id:[3395902 TO 3395902] xxx_id:[565629 TO 565629] xxx_id:[1473422 TO 1473422] xxx_id:[1724922 TO 1724922] xxx_id:[1450020 TO 1450020] xxx_id:[1517222 TO 1517222] xxx_id:[1219222 TO 1219222] xxx_id:[3343724 TO 3343724] xxx_id:[1459926 TO 1459926] xxx_id:[2621525 TO 2621525] xxx_id:[3564826 TO 3564826] xxx_id:[2518929 TO 2518929] xxx_id:[3763028 TO 3763028] xxx_id:[2601623 TO 2601623] xxx_id:[3964521 TO 3964521] xxx_id:[59028 TO 59028] xxx_id:[3697803 TO 3697803] xxx_id:[2287327 TO 2287327] xxx_id:[4192624 TO 4192624] xxx_id:[2455921 TO 2455921] xxx_id:[1652523 TO 1652523] xxx_id:[2548624 TO 2548624] xxx_id:[3450626 TO 3450626] xxx_id:[3447323 TO 3447323] ...", "time": "818.2819640ms", "time_in_nanos": 818281964, "breakdown": { "score": 12800967, "build_scorer_count": 42, "match_count": 0, "create_weight": 655340, "next_doc": 14473065, "match": 0, "create_weight_count": 1, "next_doc_count": 19101, "score_count": 18676, "build_scorer": 765710659, "advance": 24604092, "advance_count": 21 }, "children": [ { "type": "", "description": "xxx_id:[3976321 TO 3976321]", "time": "0.8574110000ms", "time_in_nanos": 857411, "breakdown": { "score": 2363, "build_scorer_count": 42, "match_count": 0, "create_weight": 418, "next_doc": 2590, "match": 0, "create_weight_count": 1, "next_doc_count": 11, "score_count": 11, "build_scorer": 824560, "advance": 27394, "advance_count": 21 } } ... 
] } ], "rewrite_time": 63431, "collector": [ { "name": "CancellableCollector", "reason": "search_cancelled", "time": "22.17472100ms", "time_in_nanos": 22174721, "children": [ { "name": "SimpleTopScoreDocCollector", "reason": "search_top_hits", "time": "14.49816800ms", "time_in_nanos": 14498168 } ] } ] } ], "aggregations": [] } ... ] } }
- cpu analysis result: [screenshot: image.png]
Result of 2.x
{ "took": 688, "timed_out": false, "_shards": { "total": 4, "successful": 4, "failed": 0 }, "hits": { "total": 75350, "max_score": 0, "hits": [...] } }

Solution?

Is this due to the change of the numeric data structure in 5.0? Can I solve it by reindexing the field as keyword?

polyfractal · March 2, 2018, 5:29pm

Basically, yes.

In ES 5.x, numerics use a new data structure (the BKD tree). This allows better compression, faster numeric operations and lower memory usage, but it is not ideal for "point lookups" like a term query. That is, it is designed for numeric-style operations like ranges, not for single-value lookups.

If that field is only used for exact-match lookups, you can re-index it as a keyword. Keyword fields are optimized for exact-match lookups and will be a lot faster.
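A minimal sketch of that change, written as Python dicts so the request bodies are easy to inspect. Only the field name `xxx_id` and the five sample IDs come from the original post. The mapping type name `doc`, the helper `build_terms_query`, and the choice of a `terms` query in place of `query_string` are assumptions for illustration.

```python
import json

# Hypothetical mapping type name; 5.x mappings still nest under a type.
MAPPING_TYPE = "doc"

# Body for creating the new index: xxx_id is mapped as keyword, not as a numeric type.
mapping_body = {
    "mappings": {
        MAPPING_TYPE: {
            "properties": {
                "xxx_id": {"type": "keyword"}
            }
        }
    }
}


def build_terms_query(ids, size=1000, timeout="5000ms"):
    """Build an exact-match search body against a keyword xxx_id field."""
    return {
        "from": 0,
        "size": size,
        "timeout": timeout,
        # Keyword values are strings, so the IDs are sent as strings.
        "query": {"terms": {"xxx_id": [str(i) for i in ids]}},
    }


# First five IDs from the query in the original post.
sample_ids = [3976321, 2681125, 3395902, 565629, 1473422]
search_body = build_terms_query(sample_ids)

print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The first body would be used when creating the new index and the second sent to `_search`. Existing documents still have to be reindexed into the new index before the keyword field can be queried.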

More info: Tune for search speed | Elasticsearch Guide [master] | Elastic

To dive a bit more into the technicals, the BKD data structure doesn't support sorted iteration, so it has to collect all matches, sort the array and then return an iterator over that sorted array (paraphrasing). That process happens during the build_scorer step. It isn't bad when dealing with numeric ranges, since the cost is amortized over all the values being iterated over, but it can get expensive when asking for a bunch of individual points.
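The collect-then-sort step can be pictured with a toy model. This is a deliberately simplified sketch and not Lucene's actual code: `ToyPointIndex`, its `build_scorer` method, and the sample data are all hypothetical. It only shows the shape of the work: entries are organized by value, so matching doc IDs have to be collected and sorted before an iterator can be returned, and that setup is paid once per clause.

```python
import bisect


class ToyPointIndex:
    """Hypothetical stand-in for a points index: entries ordered by value, not by doc ID."""

    def __init__(self, docs):
        # docs is a list of (doc_id, value) pairs in arbitrary order.
        # A stable sort by value leaves doc IDs unordered within equal values.
        self.points = sorted(docs, key=lambda pair: pair[1])
        self.values = [value for _, value in self.points]

    def build_scorer(self, low, high):
        """Collect doc IDs with low <= value <= high, sort them, return an iterator."""
        start = bisect.bisect_left(self.values, low)
        end = bisect.bisect_right(self.values, high)
        matches = [doc_id for doc_id, _ in self.points[start:end]]
        matches.sort()  # the iterator must yield doc IDs in increasing order
        return iter(matches)


index = ToyPointIndex([(7, 100), (2, 100), (5, 200), (1, 100), (3, 200)])

# One range clause: a single collect-and-sort covers every matching value.
range_hits = list(index.build_scorer(100, 200))

# Many point clauses: one collect-and-sort per requested value.
point_hits = [list(index.build_scorer(v, v)) for v in (100, 200)]

print(range_hits)
print(point_hits)
```

In this model a query with 500 to 1000 IDs repeats the setup 500 to 1000 times, which is where the per-clause overhead comes from.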

2.x didn't have BKD trees, hence the difference in performance.
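The figures in the first post are consistent with this. A short calculation, using only numbers copied from the search and profile output there (the variable names are mine, and the profile values cover the one shard shown):

```python
# Values copied from the search and profile output in the original post.
took_5x_ms = 1417
took_2x_ms = 688
bool_total_ns = 818281964          # BooleanQuery "time_in_nanos"
bool_build_scorer_ns = 765710659   # BooleanQuery "build_scorer"
child_total_ns = 857411            # first child clause "time_in_nanos"
child_build_scorer_ns = 824560     # first child clause "build_scorer"

slowdown = took_5x_ms / took_2x_ms
bool_share = bool_build_scorer_ns / bool_total_ns
child_share = child_build_scorer_ns / child_total_ns

print(f"5.x took / 2.x took: {slowdown:.2f}x")
print(f"build_scorer share of BooleanQuery time: {bool_share:.1%}")
print(f"build_scorer share of one child clause: {child_share:.1%}")
```

The 5.x request takes roughly twice as long as the 2.x one, and build_scorer accounts for well over 90% of the profiled query time, both for the BooleanQuery as a whole and for the single child clause shown.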

ddorian43 · March 3, 2018, 11:50am

Is there any special encoding done to the terms when it sees that they're all numbers? Kind of like what is done with _id in 6+.

Zzzxf · March 17, 2018, 11:45am

Thank you very much!
