ES version: 0.90.10
We measured the peak traffic, allocated an over-provisioned number of EC2
m1.xlarge instances, and got the cluster ready to take traffic.
Immediately after turning on the traffic, the whole ES cluster went down with
an OOM error. I analyzed a heap dump and 6.5GB of it was full of TransportService,
which means the ES server instance was backed up with unhandled requests from
- Client's behavior
There are 500 threads doing bulk requests against the ES cluster with a
timeout of 2 seconds. I thought a 2 second timeout would be reasonable, but
when I checked the rx/tx graph, it showed 38GB per second, which is an
unbelievable number (look at the graphs). Does this mean we shouldn't use a
timeout in a large cluster?
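
To make the backlog reasoning concrete, here is a minimal sketch of how a client-side timeout alone might let unhandled requests pile up on the server. It is a toy model, not a measurement of the cluster: the function name, the service rate of 100 requests per second, and the assumption that every thread sends a fresh bulk request each time its timeout window ends (while the abandoned request stays queued on the server) are all hypothetical and not taken from the actual client code.

```python
def simulate_backlog(clients, timeout_s, service_rate_per_s, duration_s):
    """Return the server-side queue length at the end of each second.

    Assumptions (hypothetical, for illustration only):
    - every client thread sends one bulk request at the start of each
      timeout window, whether or not the previous one was answered;
    - a timed-out request is NOT removed from the server queue;
    - the server completes at most `service_rate_per_s` requests per second.
    """
    queue = 0
    history = []
    for second in range(duration_s):
        # A new timeout window starts: all clients (re)send a request.
        if second % timeout_s == 0:
            queue += clients
        # The server drains what it can during this second.
        queue = max(0, queue - service_rate_per_s)
        history.append(queue)
    return history


if __name__ == "__main__":
    # 500 threads and a 2 second timeout, as in the setup above;
    # the service rate of 100 requests/s is an assumed value.
    for second, depth in enumerate(simulate_backlog(500, 2, 100, 6)):
        print(f"t={second}s queued={depth}")
```

Under these assumptions the server receives 500 requests per window but can only finish 200, so the queue grows by 300 requests every two seconds and never drains. If the assumed service rate is raised to 250 requests per second, the same model stays bounded, which suggests the timeout itself is less of a problem than sending new requests faster than the server can handle them.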