[o.e.t.TransportService] Received response for a request that has timed out

EVINDX · July 24, 2023, 8:54am

We are receiving the following errors:

{ElasticsearchLogger} [o.e.t.TransportService] Received response for a request that has timed out, sent [21.3s/21361ms] ago, timed out [5.6s/5682ms] ago, action [indices:monitor/stats[n]], node [{--XX--}{--XX--}{--XX--}{127.0.0.1}{127.0.0.1:9201}{dim}{xpack.installed=true, transform.node=false}], id [3787]

{ElasticsearchLogger} [o.e.c.InternalClusterInfoService] failed to retrieve shard stats from node [--XX--]: [--XX--][127.0.0.1:9201][indices:monitor/stats[n]] request_id [4022] timed out after [15043ms]

{ElasticsearchLogger} [o.e.t.TransportService] Received response for a request that has timed out, sent [42.2s/42281ms] ago, timed out [27.2s/27238ms] ago, action [indices:monitor/stats[n]], node [{--XX--}{--XX--}{--XX--}{127.0.0.1}{127.0.0.1:9201}{dim}{xpack.installed=true, transform.node=false}], id [4022]

{ElasticsearchLogger} [o.e.m.f.FsHealthService] health check of [--XX--] took [9271ms] which is above the warn threshold of [5s]

{ElasticsearchLogger} [o.e.c.InternalClusterInfoService] failed to retrieve shard stats from node [--XX--]: [--XX--][127.0.0.1:9201][indices:monitor/stats[n]] request_id [4355] timed out after [14925ms]

We tried increasing the JVM heap size from 8GB to 16GB, but we are still facing the issue.

Any suggestions on how we can resolve this?
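
For readers decoding these warnings: each "Received response for a request that has timed out" line carries two durations, how long ago the request was sent and how long ago it timed out. Their difference is the timeout that was in effect. The sketch below is only an illustration; the `timing` helper and the regular expression are assumptions made for this example and are not part of Elasticsearch.

```python
import re

# Matches the two durations in the warning, for example
# "sent [21.3s/21361ms] ago, timed out [5.6s/5682ms] ago".
PATTERN = re.compile(
    r"sent \[[^/\]]+/(\d+)ms\] ago, timed out \[[^/\]]+/(\d+)ms\] ago"
)


def timing(line):
    """Return (response_ms, timeout_ms) for one warning line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    sent_ms = int(match.group(1))
    timed_out_ms = int(match.group(2))
    # The response arrived sent_ms after the request was sent; the timeout
    # fired (sent_ms - timed_out_ms) after the request was sent.
    return sent_ms, sent_ms - timed_out_ms


# The two TransportService warnings quoted in the post above (shortened).
lines = [
    "sent [21.3s/21361ms] ago, timed out [5.6s/5682ms] ago, action [indices:monitor/stats[n]]",
    "sent [42.2s/42281ms] ago, timed out [27.2s/27238ms] ago, action [indices:monitor/stats[n]]",
]

for line in lines:
    response_ms, timeout_ms = timing(line)
    print(f"response after {response_ms} ms, timeout fired after {timeout_ms} ms")
```

For the second line this gives a timeout of 15043 ms, which matches the "timed out after [15043ms]" message logged for request 4022. In both cases the stats request was answered, just well after the roughly 15 s limit had passed.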

dadoonet · July 24, 2023, 8:58am

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v
GET /_cat/indices?v
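
One way to capture the output of these four requests as plain text that can be pasted into a post is sketched below. It assumes a node reachable over plain HTTP without authentication at a base URL such as `http://localhost:9200`; the helper names (`build_urls`, `fetch_text`, `collect`) are made up for this example.

```python
import urllib.request

# The four requests listed above.
ENDPOINTS = ["/", "/_cat/nodes?v", "/_cat/health?v", "/_cat/indices?v"]


def build_urls(base_url):
    """Join the base URL (assumed: no auth, no TLS) with each endpoint."""
    base = base_url.rstrip("/")
    return [base + path for path in ENDPOINTS]


def fetch_text(url, timeout=10):
    """Return the response body of a GET request as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def collect(base_url):
    """Fetch every endpoint and return one text report."""
    parts = []
    for url in build_urls(base_url):
        parts.append(f"GET {url}\n{fetch_text(url)}")
    return "\n".join(parts)


# Example, needs a running node:
# print(collect("http://localhost:9200"))
```

A secured cluster would additionally need credentials and HTTPS, which this sketch leaves out.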

EVINDX · July 25, 2023, 8:06am

Here are the results:
GET /

GET /_cat/health?v

GET /_cat/indices?v

GET /_cat/nodes?v

dadoonet · July 25, 2023, 8:53am

Please don't post images of text as they are hard to read, may not display correctly for everyone, and are not searchable.

Instead, paste the text and format it with </> icon or pairs of triple backticks (```), and check the preview window to make sure it's properly formatted before posting it. This makes it more likely that your question will receive a useful answer.

It would be great if you could update your post to solve this.

dadoonet · July 25, 2023, 8:55am

What's the heap size?

Could you upgrade your cluster to the latest 7.17 version?

EVINDX · July 25, 2023, 10:18am

What's the heap size?

16 GB

Could you upgrade your cluster to the latest 7.17 version?

It would not be possible to upgrade to a newer version.

Please let me know if you have any lead on what might be the reason for these exceptions.

dadoonet · July 25, 2023, 10:33am

Do you have another node running on the machine? What else is running there?

EVINDX · July 27, 2023, 7:01am

1. Do you have another node running on the machine?

No.

2. What else is running there?

Enterprise Vault 14.2.2 

ElasticSearch 

Veritas Cluster Service

Symantec Endpoint protection

Veritas NBU Agent 

Microsoft Office 2016 

Axon Agent 

Dell SecureWorks

dadoonet · July 27, 2023, 9:50am

In your logs I can see: [127.0.0.1:9201] but in the screen capture, I can see localhost:9200... So I'm confused... What is running under port 9201?

EVINDX · July 27, 2023, 10:21am

In our case the port is selected randomly, either 9200 or 9201, when we restart our service. It's a single-node cluster.

dadoonet · July 27, 2023, 12:46pm

I disagree. There's something in the logs of the 9200 which mentions a call from 9201.
Meaning that 2 nodes are running on the same machine.

It's super unclear what you are doing TBH.

EVINDX · July 28, 2023, 8:24am

What is happening in our environment:

The logs that were provided and the snapshots of the GET APIs are from different times.
Basically, when we restart our internal services, a port is selected randomly, and further communication takes place through it.
After the logs were provided, our services were restarted, and because of the random port selection the port used for communication changed to 9200.
If we restart again, the port might be either 9200 or 9201.
It's either one or the other.

So let's say 9200 is currently selected: the logs will show 9200 and the GET APIs will succeed only on 9200.
Similarly, if port 9201 is selected, the logs will show 9201 and the GET APIs will succeed only on 9201.

So when the logs were collected, 9201 was in use, but because of a service restart the port changed to 9200.

I hope this helps.

dadoonet · July 28, 2023, 9:24am

Elasticsearch uses port 9200 by default and does not pick a random port.
It uses 9201 if AND ONLY IF something is already running on 9200.
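
The behaviour described here, taking the first port in a range that can be bound, can be illustrated with a small sketch. This is not Elasticsearch's own code; `first_free_port` is a hypothetical helper, and the demo uses a port handed out by the operating system rather than 9200 so that it can run anywhere.

```python
import socket


def first_free_port(host, start, end):
    """Return the first port in [start, end] that can be bound, else None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as candidate:
            try:
                candidate.bind((host, port))
            except OSError:
                continue  # port is taken or not permitted, try the next one
            return port
    return None


# Occupy one port, then search a range that starts at that port.
blocker = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
blocker.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
blocker.listen(1)
taken = blocker.getsockname()[1]

chosen = first_free_port("127.0.0.1", taken, min(taken + 100, 65535))
print(f"port {taken} is in use, so the search moved on to {chosen}")
blocker.close()
```

The search only moves past the first port because something is already holding it, which is why a node that comes up on 9201 suggests that another process was bound to 9200 at startup.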

But let's come back to the discussion.

I don't see anything wrong on the Elasticsearch side. But Elasticsearch should run alone on a server. All the other services you listed below might steal RAM/CPU/...

Enterprise Vault 14.2.2
Veritas Cluster Service
Symantec Endpoint protection
Veritas NBU Agent
Microsoft Office 2016
Axon Agent
Dell SecureWorks

So if it's a production machine, please remove all the other services.
If it's a dev machine, I think you can ignore the warnings.

EVINDX · July 28, 2023, 1:11pm

Is there a way to disable these warnings?
Is there any registry key, environment variable, config change, or anything else that can be used to disable these calls?

dadoonet · July 28, 2023, 1:12pm

Is it a production machine?

EVINDX · July 31, 2023, 2:24am

Yes, it's a production machine.

dadoonet · July 31, 2023, 7:54am

As I said:

Make sure that Elasticsearch starts only on 9200 port and that you never try to start Elasticsearch multiple times on the same machine (which by default is not possible).
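
A quick way to see whether something is listening on both 9200 and 9201 at the same time is to try a TCP connection to each. The sketch below assumes the node binds to `127.0.0.1`, as in the logs above; `is_listening` and `listening_ports` are hypothetical helper names.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def listening_ports(host, ports):
    """Return the subset of ports that accept connections."""
    return [port for port in ports if is_listening(host, port)]


# Ports taken from this thread; both answering at once would point to
# two processes holding them.
print(listening_ports("127.0.0.1", [9200, 9201]))
```

An open port does not prove that the listener is Elasticsearch, so a `GET /` against each port that answers is still needed to compare node names. To keep a node from moving to 9201, `http.port` can be set to the single value `9200` in `elasticsearch.yml` instead of leaving the default range in place; startup then fails if 9200 is already taken.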

system · August 28, 2023, 7:55am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
