Response time much greater than query time in slowlogs - AWS ES

Hi folks, we are using Amazon Elasticsearch and we're seeing a big discrepancy between the response time measured by the code querying Elasticsearch and the query time reported in the slowlogs produced by Elasticsearch.

We've turned on slowlogs, and they generally report queries taking less than 100ms. However, as measured by our code (running in AWS Lambda), the queries take at least 500ms to respond. I've tried submitting queries with profile=true and see the same pattern: Elasticsearch reports queries taking less than 100ms to run, but it takes close to 300-500ms for us to get a result back.
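To put a number on the gap, one option is to time each request on the client and subtract the `took` value (in milliseconds) that Elasticsearch returns in the search response body. The sketch below is only illustrative: `measure_overhead` and `fake_search` are hypothetical names, and `fake_search` stands in for whatever client call the Lambda actually makes.

```python
import time


def measure_overhead(run_search):
    """Time one search call and compare it with the 'took' value in the response.

    run_search is any zero-argument callable returning the parsed search
    response as a dict (hypothetical; plug in your own client call).
    """
    start = time.perf_counter()
    response = run_search()
    client_ms = (time.perf_counter() - start) * 1000.0
    took_ms = response["took"]  # query time in ms as reported by Elasticsearch
    return {
        "client_ms": client_ms,
        "took_ms": took_ms,
        "overhead_ms": client_ms - took_ms,
    }


def fake_search():
    """Stand-in for a real search: pretends the round trip takes about 50 ms."""
    time.sleep(0.05)
    return {"took": 10, "hits": {"hits": []}}


if __name__ == "__main__":
    print(measure_overhead(fake_search))
```

Logging `overhead_ms` per request would show whether the extra time is constant (network, TLS, request signing) or only appears after idle periods.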

Our system is lightly loaded: fewer than 4 queries per minute, light indexing (60 index ops per minute), and 8% CPU utilization. It holds ~2,617,471 documents using ~5 GB of memory.

It is not obvious to me where to look for this overhead.

I've seen the post "Elasticsearch 7.2 slow query after update", but I'm not sure if our problem is related. We'll probably try adjusting index.refresh_interval and index.search.idle.after this week to see if it makes a difference, though I'd love to get suggestions on what else might be going on here.

Thanks in advance,
John

After explicitly setting index.refresh_interval to 1s, we started to see much lower response times from the Elasticsearch cluster.

It appears that index.refresh_interval has been overloaded to carry two meanings:

  1. Its value defines the frequency of refreshes
  2. Whether it has been explicitly set (rather than left at the default) defines whether search queries can be blocked waiting on a refresh
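For anyone wanting to try the same change, here is a minimal sketch of the request using only the Python standard library. The endpoint URL and index name are placeholders, `build_refresh_interval_request` is a hypothetical helper, and the sketch assumes the domain accepts unsigned requests; an Amazon Elasticsearch domain that requires IAM-signed requests would need signing added before sending.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval="1s"):
    """Build a PUT <index>/_settings request that sets index.refresh_interval explicitly.

    base_url and index are placeholders for your own domain endpoint and index.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_refresh_interval_request("https://my-domain.example.com", "my-index")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply the setting (not executed here):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a live cluster.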
