Bulk import performance


(milodky) #1

Hi,

I'm importing a massive number of documents into Elasticsearch 1.7.1.

At the beginning everything looked fine, but after some time (roughly 1b documents had been imported), bulk indexing performance got worse.

Then I changed the refresh interval to -1, set the indices store throttle type to none, and increased the bulk queue size, but none of these helped much.
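For reference, here is a minimal sketch of how those three changes could be applied over the REST API. The node address, the index name my_index, and the queue size of 1000 are placeholders, not values from my cluster. The setting names are the 1.x ones as I understand them (index.refresh_interval, indices.store.throttle.type, threadpool.bulk.queue_size); please check them against the documentation for your version.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed node address
INDEX = "my_index"                # hypothetical index name


def index_settings_body():
    """Index-level settings: disable periodic refresh during the import."""
    return {"index": {"refresh_interval": "-1"}}


def cluster_settings_body(bulk_queue_size=1000):
    """Cluster-level transient settings (1.x names, assumed):
    turn off store throttling and enlarge the bulk thread pool queue."""
    return {
        "transient": {
            "indices.store.throttle.type": "none",
            "threadpool.bulk.queue_size": bulk_queue_size,
        }
    }


def put_json(url, body):
    """Send a JSON body with HTTP PUT and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def apply_settings():
    """Apply both groups of settings; needs a reachable cluster."""
    index_resp = put_json(f"{ES_URL}/{INDEX}/_settings", index_settings_body())
    cluster_resp = put_json(f"{ES_URL}/_cluster/settings", cluster_settings_body())
    return index_resp, cluster_resp
```

Calling apply_settings() sends both requests. The refresh interval should be set back to its previous value once the import finishes, otherwise newly indexed documents will not become searchable until a manual refresh.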

One thing I did notice is that disk read I/O is quite high compared with disk write I/O:

  ID  PRIO  USER      DISK READ   DISK WRITE  SWAPIN  IO>      COMMAND
5143  be/4  elastics   9.53 M/s   43.38 K/s   0.00 %  90.59 %  java -Xms15g -Xmx15g -Djava.awt.headless=true -XX:+UseParN~etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
5145  be/4  elastics  12.39 M/s   39.43 K/s   0.00 %  90.35 %  java -Xms15g -Xmx15g -Djava.awt.headless=true -XX:+UseParN~etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
5144  be/4  elastics  12.08 M/s   35.49 K/s   0.00 %  90.26 %  java -Xms15g -Xmx15g -Djava.awt.headless=true -XX:+UseParN~etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch
5146  be/4  elastics  14.60 M/s   55.21 K/s   0.00 %  90.04 %  java -Xms15g -Xmx15g -Djava.awt.headless=true -XX:+UseParN~etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch

Is there anything else I can do to improve the situation?

Thanks,

Tim


(system) #2