Hi!
I've launched 5 Elasticsearch nodes (each on a different host) - 2 non-data
nodes and 3 data nodes. Then I started pushing a few GB of data to the
cluster. While doing this I saw in htop that the ES processes were using
more and more memory. After some time I see entries like this in the logs:
[13:34:45,762][WARN ][jgroups.FD ] I was suspected by
ggmail-test-5-49710; ignoring the SUSPECT message and sending back a
HEARTBEAT_ACK
[13:34:46,042][WARN ][jgroups.pbcast.NAKACK ] ggmail-test-3-55691:
dropped message from ggmail-test-5-49710 (not in xmit_table), keys are
[ggmail-test-2-44830, ggmail-test-3-55691, ggmail-test-2-29869,
ggmail-test-4-9592, ggmail-test-1-19291], view=[ggmail-test-1-19291|9]
[ggmail-test-1-19291, ggmail-test-4-9592, ggmail-test-2-29869,
ggmail-test-3-55691, ggmail-test-2-44830]
[13:34:54,373][WARN ][jgroups.pbcast.NAKACK ] ggmail-test-3-55691:
dropped message from ggmail-test-5-49710 (not in xmit_table), keys are
[ggmail-test-2-44830, ggmail-test-3-55691, ggmail-test-2-29869,
ggmail-test-4-9592, ggmail-test-1-19291], view=[ggmail-test-1-19291|9]
[ggmail-test-1-19291, ggmail-test-4-9592, ggmail-test-2-29869,
ggmail-test-3-55691, ggmail-test-2-44830]
and this:
[WARN ][monitor.jvm ] [Winky Man] Long GC collection occurred,
took [10s], breached threshold [10s]
I'm not able to put new data into ES (on an existing index and type). Curl
returns:
{
  "error" : "PrimaryNotStartedActionException[[mail_index1001][3] Timeout waiting for [1m]]"
}
I configured the Java heap to 1GB in elasticsearch.in.sh. Bigger values
(like 3GB) only change how long it takes before this situation happens.
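For completeness, the heap is set via the standard ES_MIN_MEM/ES_MAX_MEM variables in elasticsearch.in.sh; the relevant excerpt looks roughly like this (the 1g value is my setting, the rest is the stock script):

```shell
# elasticsearch.in.sh (excerpt) - JVM heap settings passed to each node
ES_MIN_MEM=1g
ES_MAX_MEM=1g

# min and max heap sizes; keeping them equal avoids heap resizing pauses
JAVA_OPTS="$JAVA_OPTS -Xms${ES_MIN_MEM}"
JAVA_OPTS="$JAVA_OPTS -Xmx${ES_MAX_MEM}"
```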
I use this configuration on the data nodes:
cluster:
  name: LocalCluster060
node:
  data: true
http:
  enabled: false
discovery:
  jgroups:
    config: tcp
    bind_port: 9700
    tcpping:
      initial_hosts: 10.1.112.46[9700], 10.1.112.58[9700], 10.1.112.47[9700], 10.1.112.59[9700], 10.1.112.214[9700]
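The non-data nodes use the same configuration except for these two settings (a sketch; I'm assuming the usual node.data / http.enabled toggles are what distinguishes them):

```yaml
node:
  data: false    # non-data node: coordinates requests, holds no shards
http:
  enabled: true  # assumption: clients reach the cluster through these nodes over HTTP
```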
What is the reason for this problem?