ES query slow with a simple filter, and profile doesn't show it

I'm running ES 7.6.0 with 1 million docs, 1.76 GiB in size.
I found that a query containing only a simple term filter is slow (about 2 s to execute).
The query body is below:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "city_id": "4"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 500,
  "_source": false
}

And here is the response:

{
  "took": 2078,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": "_",
        "_type": "_doc",
        "_id": "JADw3XEBfaJNtOBspQGc",
        "_score": 0
      },
      {
        "_index": "_",
        "_type": "_doc",
        "_id": "awAO3nEBfaJNtOBsgUDo",
        "_score": 0
      },
      {
        "_index": "_",
        "_type": "_doc",
        "_id": "VQAh3nEBfaJNtOBsbGxJ",
        "_score": 0
      }
      ........

When I set profile to true (of course I switched city_id to avoid cache effects), it reported that only tens of milliseconds were spent on the query, while the actual took is about 2-5 seconds.
Since I set _source to false, there should be no need to fetch the source from disk. So where is the rest of the time spent? What can I do to make such a query faster?

Welcome!

Is that happening again after some calls?
I mean that it's fine to have it slow for the first request but it should be fast on the second call.

Also, what kind of disks do you have? SSD?

Thanks for the reply. :grinning:

Yeah, it becomes much faster after the first request.

Our disk is an ordinary mechanical drive. I've also tried it on an SSD, but the improvement is small. The scoring happens in the index, and since I don't retrieve the source field, the disk shouldn't matter, should it?

As far as I know, some search engines recall around 500 docs and pass them to a finer ranking step, and the whole pipeline finishes within 1 second. So there is not much time budget for the first recall step. Besides, the time spent is already high even though I only use the simplest term filter. The delay would become unbearable if I added more complicated filter conditions.

I wonder what others do when building multi-step ranking, such as ES together with a learning-to-rank model.

So that's fine, no?

I mean that in production you probably won't start Elasticsearch, run a query and then shut it down.

No, I don't think so. Unless you do things that can really slow it down.

But I'll be happy to look at your slow queries if any.

Do you mean that before we offer the service to others, we should find and run enough query cases to warm ES up?

How much cache should we allocate for about 2 GB of docs if we want to cover almost all query cases in the cache?

Another thing:
I guess I might have misunderstood some concept, since I was confused by the huge difference between the time the profile shows and the real one (10 ms vs 2 s).
Although I understand that the profile is calculated by sampling, the gap still seems unreasonable.

If you don't want the very first user of your service to pay the price, most likely yes.

It's not exactly about caching all queries but about letting the OS cache the files that are frequently read.
There is also some caching for filters, though. Both matter.
Just run some tests yourself. You'll probably see whether or not you really need to implement a warmer. My guess is that it is normally not needed.
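
To make the "run some tests yourself" suggestion concrete, here is a minimal Python sketch of a warm-up loop. It reuses the filter-only request body from the first post. The `send` callable is an assumption: it stands in for whatever client you use to POST a body to your index's `_search` endpoint and return the parsed JSON response. `fake_send` is a hypothetical placeholder so the sketch runs without a cluster.

```python
def build_term_filter_query(field, value, size=500):
    """Build the filter-only request body used earlier in this thread."""
    return {
        "query": {"bool": {"filter": [{"term": {field: value}}]}},
        "from": 0,
        "size": size,
        "_source": False,
    }


def warm_up(send, field, values):
    """Run one query per value and collect the `took` (ms) each response reports."""
    tooks = []
    for value in values:
        response = send(build_term_filter_query(field, value))
        tooks.append(response["took"])
    return tooks


def fake_send(body):
    """Hypothetical stand-in for a real HTTP call; returns a canned response."""
    return {"took": 5, "hits": {"hits": []}}


if __name__ == "__main__":
    print(warm_up(fake_send, "city_id", ["1", "2", "3"]))
```

Running the loop twice against a real node and comparing the two lists of `took` values is one way to see how much of the first-call cost disappears once the files are cached.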

I don't think that profile is sampling. But if you are running the query with "profile": true after you ran a similar query, then the cache is probably playing a role.

If you want to have a similar response time, you should:

  • stop the node
  • start the node
  • run the 1st query with "profile": true
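
The last step of that procedure could be sketched as below: add `"profile": true` to the request body and record both the `took` the response reports and the wall-clock time seen by the client. As an assumption, `send` is any callable that posts the body to `_search` and returns the parsed response; `fake_send` is a hypothetical placeholder that returns the `took` value quoted in the first post.

```python
import copy
import time


def with_profile(body):
    """Return a copy of the request body with profiling switched on."""
    profiled = copy.deepcopy(body)
    profiled["profile"] = True
    return profiled


def timed_first_query(send, body):
    """Send the profiled query once; return (took_ms, client_wall_clock_ms)."""
    start = time.perf_counter()
    response = send(with_profile(body))
    wall_ms = (time.perf_counter() - start) * 1000.0
    return response["took"], wall_ms


def fake_send(body):
    """Hypothetical stand-in for a real HTTP call; returns a canned response."""
    return {"took": 2078, "profile": {"shards": []}}


if __name__ == "__main__":
    request = {
        "query": {"bool": {"filter": [{"term": {"city_id": "4"}}]}},
        "from": 0,
        "size": 500,
        "_source": False,
    }
    print(timed_first_query(fake_send, request))
```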

I have tried many times, carefully avoiding cache effects by switching the id every time.
Here is a typical snapshot from Kibana:

It says that the time spent is only 3.5 milliseconds. As we know, that is far less than what the query really takes.
I wonder what the profile time really represents?
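
One way to quantify the gap is to add up what the profile section reports per shard and compare it with the top-level `took`. The Python sketch below does that, assuming the usual profile response layout (`profile.shards[].searches[]` with `query`, `rewrite_time`, and `collector` entries). A top-level query node's `time_in_nanos` already includes its children, so only top-level nodes are summed. Query and collector times are measured independently, so treat the per-shard total as a rough upper bound only. As far as I know, the profile output in this version covers the query phase on each shard and does not include the fetch phase or network overhead, so a large remainder would point there. The shard timings in the sample are made up for illustration; only `took` comes from the first post.

```python
def profiled_millis_per_shard(response):
    """Return {shard_id: {"query_ms", "rewrite_ms", "collector_ms"}} from a profiled response."""
    result = {}
    for shard in response["profile"]["shards"]:
        query_ns = rewrite_ns = collector_ns = 0
        for search in shard["searches"]:
            # Top-level nodes only: their time is inclusive of children.
            query_ns += sum(node["time_in_nanos"] for node in search["query"])
            rewrite_ns += search["rewrite_time"]
            collector_ns += sum(c["time_in_nanos"] for c in search["collector"])
        result[shard["id"]] = {
            "query_ms": query_ns / 1e6,
            "rewrite_ms": rewrite_ns / 1e6,
            "collector_ms": collector_ns / 1e6,
        }
    return result


def unexplained_millis(response):
    """`took` minus the slowest shard's profiled total (a rough estimate)."""
    totals = [sum(parts.values()) for parts in profiled_millis_per_shard(response).values()]
    return response["took"] - max(totals, default=0.0)


# Illustrative response: shard id and timings are invented for this sketch.
sample = {
    "took": 2078,
    "profile": {"shards": [{
        "id": "[node][index][0]",
        "searches": [{
            "query": [{"type": "TermQuery", "time_in_nanos": 3000000}],
            "rewrite_time": 500000,
            "collector": [{"name": "SimpleTopScoreDocCollector", "time_in_nanos": 1000000}],
        }],
    }]},
}

if __name__ == "__main__":
    print(profiled_millis_per_shard(sample))
    print(unexplained_millis(sample))
```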