Thank you for the recommendations.
I am not saturating the network at all; I am CPU-bound on the ES cluster
side during the load phase. These servers are dedicated to ES and are
currently responsible only for the initial loading of the data.
Unfortunately I do not have the luxury of solid state storage, but I will
look at storage-layer optimizations when that becomes the bottleneck.
-Adi
On Wednesday, 13 November 2013 00:15:57 UTC-8, Jörg Prante wrote:
Please use the official python clients.
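For example, a minimal sketch of bulk indexing with the official
elasticsearch-py client and its bulk helper (the host, index and type names
here are just placeholders, not from your setup):

    # minimal sketch: bulk indexing with the official Python client
    # (pip install elasticsearch); host/index/type names are made up
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(['localhost:9200'])

    # a generator of actions, so the whole dataset never sits in memory
    actions = ({'_index': 'myindex', '_type': 'doc', '_source': {'n': i}}
               for i in xrange(1000))

    # helpers.bulk batches the actions into _bulk requests for you
    success, errors = helpers.bulk(es, actions)
    print success, errors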
Also monitor the network interface if you index from a remote host.
You have 15b+ docs, which I assume amounts to some GBs of data. If the
network is saturated and you can spare CPU cycles, use gzip compression on
the HTTP bulk requests.
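As a rough sketch of what that looks like with plain urllib2 (this assumes
the cluster listens on localhost:9200 and has http.compression enabled in
its settings):

    # minimal sketch: gzip-compress a _bulk body before sending it;
    # assumes http.compression is enabled on the cluster
    import gzip
    import io
    import urllib2

    bulk_body = ('{"index":{"_index":"myindex","_type":"doc","_id":"1"}}\n'
                 '{"field":"value"}\n')

    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode='wb') as gz:
        gz.write(bulk_body)

    req = urllib2.Request('http://localhost:9200/_bulk', buf.getvalue())
    req.add_header('Content-Encoding', 'gzip')
    req.add_header('Content-Type', 'application/json')
    print urllib2.urlopen(req).read()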
If you do not feel like using the official client, check whether httplib2
is a better choice than urllib2; it supports compression.

Also check that you use fast JSON encoding on the client side. ujson is a
fast drop-in replacement for the slow standard json python lib.
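Since ujson mirrors the stdlib json API for dumps/loads, the swap is a
one-line change; a sketch:

    # minimal sketch: ujson as a drop-in for the stdlib json module
    try:
        import ujson as json  # pip install ujson
    except ImportError:
        import json  # fall back to the slower stdlib encoder

    doc = {'user': 'test', 'count': 42}
    print json.dumps(doc)  # same call as the stdlib json.dumps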
For fast persisting in the cluster, use SSDs instead of spindle disks. On
file systems backed by spindle disks, you should disable atime updates
(noatime) in the Linux file system mount for the data dir for better I/O
throughput.

Jörg