Failed to execute on node [YK3h1SM5SZW2K-JloBdVCg]

I am using the command curl -H "Content-Type: application/x-ndjson" -XPOST http://192.168.1.102:9200/myindex/mytype/_bulk --data-binary @file to import a JSON file into my ES cluster.
My master nodes are: [192.168.1.101, 192.168.1.102, 192.168.1.103]
My data nodes are: [192.168.1.105, 192.168.1.106, 192.168.1.107, 192.168.1.108]
The master log shows this:

failed to execute on node [YK3h1SM5SZW2K-JloBdVCg]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [data5][192.168.1.108:9300][cluster:monitor/nodes/stats[n]] request_id [19314] timed out after [15001ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:982)
...
...
Received response for a request that has timed out, sent [25237ms] ago, timed out [10236] ago, action [cluster: monitor/nodes/stats[n]], node

The exceptions appear at random. Sometimes the import succeeds and the ES cluster stays GREEN; sometimes some of my data nodes crash.

I am not sure whether this is caused by low memory: when I check the memory status with free -lh, it shows only 200MB free (out of 64GB total).

Could anybody give me some hints?

That looks like monitoring failing to gather statistics about node performance in a timely fashion.
My guess is these data nodes are running hot? Are you seeing GC warnings in their log files?
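
If it helps to check quickly whether the nodes are running hot, here is a minimal sketch that reads the JVM section of the node stats API (GET _nodes/stats/jvm) and lists nodes with high heap usage. The function names and the 75% threshold are my own choices for illustration, not anything Elasticsearch prescribes, and the URL simply reuses the host from the curl command above.

```python
import json
from urllib.request import urlopen


def hot_nodes(stats, heap_threshold=75):
    """Return (name, heap_used_percent, old_gc_count) for nodes at or above the threshold.

    `stats` is the parsed JSON body of GET _nodes/stats/jvm.
    The threshold is an arbitrary illustrative value.
    """
    result = []
    for node in stats.get("nodes", {}).values():
        jvm = node.get("jvm", {})
        heap = jvm.get("mem", {}).get("heap_used_percent")
        old_gc = jvm.get("gc", {}).get("collectors", {}).get("old", {})
        if heap is not None and heap >= heap_threshold:
            result.append((node.get("name"), heap, old_gc.get("collection_count")))
    # Highest heap usage first
    return sorted(result, key=lambda r: r[1], reverse=True)


def fetch_jvm_stats(base_url="http://192.168.1.102:9200"):
    """Fetch JVM stats for all nodes; base_url is assumed from the original post."""
    with urlopen(f"{base_url}/_nodes/stats/jvm", timeout=30) as resp:
        return json.load(resp)
```

Calling hot_nodes(fetch_jvm_stats()) while a bulk import is running should show whether heap usage on the data nodes climbs before the timeouts appear.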

Thanks for your reply.
Yes, I can see many GC overhead warnings in the log files of all my nodes.
My curl bulk command sends a JSON file of about 300MB in a single request. Should I split it into several smaller files?

Yes, that would certainly help.
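
As a rough sketch of how the split could be done: the helper below (split_bulk_file is a made-up name) cuts an NDJSON bulk file into smaller files without separating an action line from its source line. It assumes every action in the file is followed by exactly one source line, as with index or create actions; a file containing delete actions would need different handling. The 10MB default is only an arbitrary starting point to experiment with.

```python
from pathlib import Path


def split_bulk_file(src, out_dir, max_bytes=10 * 1024 * 1024):
    """Split an NDJSON bulk file into chunks of at most roughly max_bytes.

    Assumes strict action-line / source-line pairs. Returns the chunk paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    buf, size = [], 0

    def flush():
        nonlocal buf, size
        if not buf:
            return
        path = out_dir / f"chunk_{len(chunks):05d}.ndjson"
        path.write_bytes(b"".join(buf))
        chunks.append(path)
        buf, size = [], 0

    with open(src, "rb") as f:
        while True:
            action = f.readline()
            if not action:
                break  # end of file
            if not action.strip():
                continue  # skip stray blank lines between pairs
            source = f.readline()
            if not source.strip():
                raise ValueError("action line without a source line")
            # Every line in a bulk body must end with a newline
            pair = [line if line.endswith(b"\n") else line + b"\n"
                    for line in (action, source)]
            pair_size = len(pair[0]) + len(pair[1])
            # Start a new chunk rather than exceed the size limit
            if buf and size + pair_size > max_bytes:
                flush()
            buf.extend(pair)
            size += pair_size
    flush()
    return chunks
```

Each resulting chunk can then be sent with the same curl command as before, one file at a time, passing the chunk file to --data-binary instead of the original file.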
