Transport vs Node client for large bulk inserts?

datvikash · June 15, 2016, 3:22pm

I am trying to determine which client would be a better fit for a large bulk upload (~1 trillion items into a single index). I have tried the HTTP API, but it's very slow and painful: it has taken a week and only inserted 112 billion items so far. I imagine I would see a performance boost from using one of the native connectors. Which connector, Transport or Node, would give me the best performance and parallelism?

Appreciate the help.
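
For scale, the figures above imply a throughput that can be worked out with a few lines of arithmetic. This is only a rough sketch: it assumes "a week" means exactly 7 days, that the target is exactly 1 trillion items, and that the observed rate stays constant. The function name `days_to_finish` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400

def days_to_finish(total_items, items_done, days_elapsed):
    """Return (items per second, days remaining), assuming the observed rate holds."""
    # Average rate so far, in items per second.
    rate = items_done / (days_elapsed * SECONDS_PER_DAY)
    # Time needed for the remaining items at that same rate, in days.
    remaining_days = (total_items - items_done) / rate / SECONDS_PER_DAY
    return rate, remaining_days

# Assumed inputs: ~1 trillion target, 112 billion done, 7 days elapsed.
rate, remaining = days_to_finish(1_000_000_000_000, 112_000_000_000, 7)
print(f"{rate:,.0f} items/s, about {remaining:.1f} more days")
```

Under those assumptions the HTTP run averages roughly 185,000 items per second and would need about 55 more days to finish, which is the baseline any other client would have to beat.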

warkolm · June 16, 2016, 3:55am

This might be better in the ES category rather than the Hadoop one, as it seems more general?

datvikash · June 16, 2016, 4:10pm

Thanks. I'll try posting in the ES category.

warkolm · June 16, 2016, 8:58pm

You can move threads yourself: just edit the topic and change the category.
