Elastic cluster data nodes increased search latency

prabhakaran_ss · April 15, 2024, 5:58pm

We've provisioned an Elastic cluster within the AWS OpenSearch service, and we have a single major index with replication set to 3.

Recently, we've noticed a significant spike in search request serving times, sometimes reaching up to 2 seconds. Upon investigation, we found that a couple of data nodes were experiencing increased search latency. After restarting these nodes, they returned to normal behavior.

Here are a few observations regarding the affected data nodes:

They also experienced increased JVMGCYoungCollectionCount and JVMGCYoungCollectionTime.
These data nodes utilize AWS EBS gp2 (SSD) disks, and it appears they were running out of IOPS credits available from EBS.

Although restarting the data nodes resolved the issue and they began to function normally again, we're puzzled about how a restart could have addressed the underlying problem. It's worth noting that the ES cluster continued to serve the same traffic during this time.

ES version: 7.1

system · April 15, 2024, 5:58pm

OpenSearch/OpenDistro are AWS-run products and differ from the original Elasticsearch and Kibana products that Elastic builds and maintains. You may need to contact them directly for further assistance. See "What is OpenSearch and the OpenSearch Dashboard? | Elastic" for more details.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns.)

leandrojmp · April 15, 2024, 6:11pm

Hello and Welcome,

If you are using OpenSearch, then you need to ask this on an OpenSearch forum.

OpenSearch contains custom code written mostly by AWS.

prabhakaran_ss · April 22, 2024, 4:51pm

Our cluster's engine is Elasticsearch 7.1, and it is hosted on the AWS managed service (i.e., OpenSearch Service). That's the reason I posted it here.

Christian_Dahlqvist · April 22, 2024, 5:32pm

As far as I know, AWS runs Elasticsearch with custom plugins, so it is not standard Elasticsearch.

gp2 storage can have very limited IOPS, especially if the volumes are small, and can quickly become a bottleneck. gp2 volumes are, however, able to burst to higher IOPS for a short period of time, so it may be that you exhausted the burst credits and the restart reset the bursting calculation. Upgrading to gp3 storage, whose baseline IOPS does not depend on volume size, is probably recommended.
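
To make the burst behaviour concrete, here is a rough sketch of the gp2 credit arithmetic as published in the AWS EBS documentation: a baseline of 3 IOPS per GiB (minimum 100, maximum 16,000), a burst ceiling of 3,000 IOPS, and a bucket of 5.4 million I/O credits. The function names and the example volume sizes are hypothetical; the thread does not state the actual volume size, and the managed service may behave differently in detail.

```python
# Rough model of gp2 burst credits, based on the published gp2 figures.
# Function names and example sizes are illustrative assumptions only.

GP2_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_MAX_BASELINE_IOPS = 16_000
GP2_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS a gp2 volume of the given size sustains indefinitely."""
    scaled = GP2_IOPS_PER_GIB * size_gib
    return max(GP2_MIN_BASELINE_IOPS, min(GP2_MAX_BASELINE_IOPS, scaled))


def burst_seconds(size_gib: int, demand_iops: int = GP2_BURST_IOPS) -> float:
    """Seconds a full credit bucket lasts at the given sustained demand.

    Returns infinity when demand does not exceed the baseline, because
    credits are then never drained.
    """
    baseline = gp2_baseline_iops(size_gib)
    demand = min(demand_iops, max(GP2_BURST_IOPS, baseline))
    if demand <= baseline:
        return float("inf")
    return GP2_CREDIT_BUCKET / (demand - baseline)


if __name__ == "__main__":
    for size in (35, 100, 500, 1000):  # hypothetical volume sizes in GiB
        print(size, gp2_baseline_iops(size), burst_seconds(size))
```

Under these assumptions, a 100 GiB volume has a baseline of 300 IOPS and can sustain 3,000 IOPS for about 2,000 seconds before the bucket is empty, after which it falls back to the baseline. That would be consistent with latency climbing on only the busiest nodes while traffic stays constant.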

dadoonet · April 22, 2024, 8:04pm

BTW did you look at Cloud by Elastic, also available if needed from AWS Marketplace, Azure Marketplace and Google Cloud Marketplace?

Cloud by Elastic is one way to get access to all features, all managed by us. Think about what is already there, such as Security, Monitoring, Reporting, SQL, Canvas, Maps UI, Alerting and the built-in solutions named Observability, Security and Enterprise Search, and what is coming next ...

system · May 20, 2024, 8:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
