I set up an Elasticsearch 2.1 cluster with 3 nodes and 5 shards on AWS EC2, using m4.large (8 GB) instances. I am bulk indexing with the Java transport client. The same indexing run takes about 6 minutes for 500K documents on my local Mac, but about 40 minutes on EC2. I checked CPU, JVM, thread pools, GC, and merges on the EC2 instances, and everything looks fine.
I don't know much about storage on EC2. I am using a gp2 volume with 120 IOPS. I timed the app, and most of the time is spent writing to the index server. Is gp2 too slow for this?
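For context on the 120 IOPS figure, here is a rough sketch of the gp2 burst arithmetic. The constants (3 IOPS per GiB baseline with a floor of 100, bursting to 3000 IOPS, a full bucket of 5.4 million I/O credits) are what I understand from the AWS EBS documentation and should be treated as assumptions, since they may differ by volume size or have changed. The class and method names are purely illustrative.

```java
// Back-of-the-envelope gp2 estimate. All constants are assumptions taken
// from the AWS EBS documentation as I understand it, not measured values.
public class Gp2Estimate {
    static final int IOPS_PER_GIB = 3;                 // assumed baseline rate per GiB
    static final int MIN_BASELINE_IOPS = 100;          // assumed baseline floor
    static final int BURST_IOPS = 3000;                // assumed burst ceiling
    static final long FULL_CREDIT_BUCKET = 5_400_000L; // assumed full I/O credit balance

    // Baseline IOPS for a gp2 volume of the given size.
    static int baselineIops(int sizeGiB) {
        return Math.max(MIN_BASELINE_IOPS, IOPS_PER_GIB * sizeGiB);
    }

    // Seconds a full credit bucket lasts if the volume is driven at the
    // burst ceiling the whole time (credits drain at burst minus baseline).
    static double burstSeconds(int baselineIops) {
        if (baselineIops >= BURST_IOPS) {
            return Double.POSITIVE_INFINITY; // baseline already at or above the ceiling
        }
        return (double) FULL_CREDIT_BUCKET / (BURST_IOPS - baselineIops);
    }

    public static void main(String[] args) {
        int baseline = 120; // the figure reported for my volume
        System.out.printf("baseline IOPS: %d%n", baseline);
        System.out.printf("burst window at full rate: %.0f seconds%n", burstSeconds(baseline));
    }
}
```

Under these assumptions a 120 IOPS baseline corresponds to a 40 GiB volume, and a full credit bucket lasts 1875 seconds (about 31 minutes) at the burst ceiling, after which the volume would drop back to its baseline. Whether my indexing load actually drains the bucket is something I have not verified.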
So the issue is either in the network between my app server and the Elasticsearch servers, or inside Elasticsearch itself. To simplify the problem, I have set replicas to 0 for now. refresh_interval is "30s", translog.durability is async, and Flush.interval is 30s.
Has anybody run into this issue before? I am wondering whether this is a provisioning issue or a problem with my Elasticsearch configuration.
host         ip           heap.percent ram.percent load node.role master name
10.40.38.111 10.40.38.111           45          65 0.00 d         m      ip-10-40-38-111
10.40.37.36  10.40.37.36             9          67 0.00 d         *      ip-10-40-37-36
10.40.38.213 10.40.38.213           65          67 0.01 d         m      ip-10-40-38-213