Term query is very slow if size != 0

I have an ES 6.2.4 cluster with 3 x m4.2xlarge (EBS-optimized, gp2) nodes on AWS. The cluster contains a single index with 5 shards, 1 replica, 600 000 000 small documents (about 1 KB each, 40 string and numeric fields, default mapping) and a total primary size of around 80 GB.

When I execute a single simple match-all search request, the reported took time is around 420 ms. When I execute a single term filter query (which matches half of all documents), the reported took time is 1020 ms. That appears to be very slow. If I execute a few requests in parallel, all CPUs go up to nearly 100% and the took time rises to something like 20000 ms.

Now setting size=0 (instead of the default of 10) dramatically improves the situation. The took time is now around 1-2 ms, but of course I do not get any search hits (which is what I need).

Is there anything I am doing wrong? 1 s for a single simple term filter query appears to be very slow.

This query takes around 1100 ms (yielding 300 000 000 hits):

_search?size=10
{
  "profile": false,
  "_source": false,
  "query": {
		"constant_score" : {
			"filter" : {
				"term" : { "x" : "y"}
			}
		}
	}
}

With profiling enabled I get something like:

"profile" : {
	"shards" : [
	  {
		"id" : "[EpwCzDqTRI2NznZMJ92CMQ][mybigindex][1]",
		"searches" : [
		  {
			"query" : [
			  {
				"type" : "ConstantScoreQuery",
				"description" : "ConstantScore(x:y)",
				"time_in_nanos" : 238435912694,
				"breakdown" : {
				  "score" : 58876393511,
				  "build_scorer_count" : 30,
				  "match_count" : 0,
				  "create_weight" : 66144,
				  "next_doc" : 179380734032,
				  "match" : 0,
				  "create_weight_count" : 1,
				  "next_doc_count" : 88927977,
				  "score_count" : 88927962,
				  "build_scorer" : 863037,
				  "advance" : 0,
				  "advance_count" : 0
				},
				"children" : [
				  {
					"type" : "TermQuery",
					"description" : "x:y",
					"time_in_nanos" : 60513164939,
					"breakdown" : {
					  "score" : 0,
					  "build_scorer_count" : 30,
					  "match_count" : 0,
					  "create_weight" : 8900,
					  "next_doc" : 60423532986,
					  "match" : 0,
					  "create_weight_count" : 1,
					  "next_doc_count" : 88927977,
					  "score_count" : 0,
					  "build_scorer" : 695045,
					  "advance" : 0,
					  "advance_count" : 0
					}
				  }
				]
			  }
			],
			"rewrite_time" : 61676,
			"collector" : [
			  {
				"name" : "CancellableCollector",
				"reason" : "search_cancelled",
				"time_in_nanos" : 188403469051,
				"children" : [
				  {
					"name" : "SimpleTopScoreDocCollector",
					"reason" : "search_top_hits",
					"time_in_nanos" : 67272516521
				  }
				]
			  }
			]
		  }
		],
		"aggregations" : [ ]

This looks strange to me because no scoring should happen:

"score" : 58876393511,
"score_count" : 88927962,

The documentation for constant_score states that it

wraps another query and simply returns a constant score equal to the query boost for every document in the filter

Try it without the constant_score query, and simply do:

{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "x": "y"
        }
      }
    }
  }
}

Also, as your cluster only has 3 data nodes, you may want to set the number of shards to 3.
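Since the number of primary shards cannot be changed on an existing index, that would mean creating a new index and reindexing into it. A minimal sketch of that (the new index name is just a placeholder):

# "mybigindex_3shards" is only a placeholder name for the new index
PUT /mybigindex_3shards
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

POST /_reindex
{
  "source": { "index": "mybigindex" },
  "dest": { "index": "mybigindex_3shards" }
}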

Thanks @Magnus_Kessler

I rewrote the query as you suggested and scaled the cluster up to 5 data nodes. After the shards were relocated I tried the query again, but sadly with no improvement.

Assuming that you don't care about the order in which the documents are retrieved, you can sort by _doc order. From the documentation:

_doc has no real use-case besides being the most efficient sort order. So if you don’t care about the order in which documents are returned, then you should sort by _doc.

{
  "sort": "_doc",
  "query": {
    "bool": {
      "filter": {
        "term": { "x": "y" }
      }
    }
  }
}

This unfortunately does not yield any improvement.

I pasted information about the segments here: https://pastebin.com/raw/drKeszjP

At some point, given the huge data set and the need to select and return a subset of that data to the client, things are just going to take some time.

Can you split up the huge index into several smaller indices? These can then be searched in parallel. See this blog post for an in-depth discussion of the trade-offs involved in choosing the number and size of shards in Elasticsearch.
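As a rough sketch of what the search side could look like, assuming the split indices shared a common name prefix (the names here are made up), several smaller indices can be queried together with a wildcard pattern, and their shards are then searched in parallel:

# "mybigindex-*" assumes the split indices share this name prefix
GET /mybigindex-*/_search
{
  "_source": false,
  "query": {
    "bool": {
      "filter": {
        "term": { "x": "y" }
      }
    }
  }
}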

Regarding your original question: When you use size=0, Elasticsearch can cache the request, and will respond a lot quicker after a short warmup period.
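For reference, the size=0 shape of the query from above, which returns only the hit count and is the kind of request the shard request cache caches by default:

_search?size=0
{
  "query": {
    "bool": {
      "filter": {
        "term": { "x": "y" }
      }
    }
  }
}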

Thank you kindly for your support.

Explicitly using the shard request cache solved the issue (/_search?request_cache=true).
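For completeness, a sketch of the full request with the cache parameter, reusing the filter query from above:

_search?request_cache=true
{
  "_source": false,
  "query": {
    "bool": {
      "filter": {
        "term": { "x": "y" }
      }
    }
  }
}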

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.