We are planning to bulk insert about 10 GB of data; however, we are being
forced to do this from a remote host.
Is this a good practice? And are there any potential issues I should watch
out for?
The best way to approach this is to restrict the size of each bulk request
and/or the number of documents per request.
I tend to do both: the best sizes seem to be in the 5 to 10 MiB range, and
I also cap the number of documents per request (e.g. 5,000), although that
isn't strictly necessary.
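For what it's worth, here is a minimal sketch of that kind of batching in
Python, assuming the data is an NDJSON bulk file of index actions (each
action line followed by its source line) and a node reachable at
http://localhost:9200; the file name documents.ndjson, the send_bulk
helper, and the exact limits are illustrative, not something from this
thread.

import requests  # plain HTTP client; no Elasticsearch client library assumed

# Illustrative settings, not from this thread.
BULK_URL = "http://localhost:9200/_bulk"
MAX_BYTES = 8 * 1024 * 1024   # stay inside the 5-10 MiB range per request
MAX_DOCS = 5000               # optional cap on documents per request

def send_bulk(lines):
    """POST one batch of NDJSON lines to the _bulk endpoint."""
    body = "".join(lines)
    if not body.endswith("\n"):
        body += "\n"  # the bulk body must end with a newline
    resp = requests.post(
        BULK_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
    )
    resp.raise_for_status()
    if resp.json().get("errors"):
        # Individual items can fail even when the HTTP call succeeds.
        raise RuntimeError("bulk request reported item-level errors")

batch, batch_bytes, batch_docs = [], 0, 0
with open("documents.ndjson", encoding="utf-8") as src:
    while True:
        action = src.readline()
        if not action:
            break
        source = src.readline()  # assumes every action line has a source line
        pair = action + source
        pair_bytes = len(pair.encode("utf-8"))
        # Flush the current batch before it exceeds either limit.
        if batch and (batch_bytes + pair_bytes > MAX_BYTES or batch_docs >= MAX_DOCS):
            send_bulk(batch)
            batch, batch_bytes, batch_docs = [], 0, 0
        batch.append(pair)
        batch_bytes += pair_bytes
        batch_docs += 1
if batch:
    send_bulk(batch)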
You should check out this link from the documentation:
Chris
David and Christopher, thanks for your advice. I did split the files into
12 MB chunks, which turned out to be optimal after testing various sizes.
I wanted to draw on your experience of potential issues with bulk indexing
from a local host versus bulk indexing from a remote host.
I did choose to bulk index locally.
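For reference, here is a rough sketch of how such a split might be
scripted, assuming the same kind of NDJSON bulk file with index actions;
the split_bulk_file helper, the chunk size, and the file names are only
illustrative.

# Split a large NDJSON bulk file into ~12 MB chunk files, keeping each
# action line together with its source line.
CHUNK_BYTES = 12 * 1024 * 1024

def split_bulk_file(path, prefix="chunk"):
    chunk_index, written, out = 0, 0, None
    with open(path, encoding="utf-8") as src:
        while True:
            action = src.readline()
            if not action:
                break
            source = src.readline()  # assumes index actions with a source line
            pair = action + source
            pair_bytes = len(pair.encode("utf-8"))
            # Start a new chunk file before the size limit is exceeded.
            if out is None or written + pair_bytes > CHUNK_BYTES:
                if out is not None:
                    out.close()
                chunk_index += 1
                out = open(f"{prefix}_{chunk_index:04d}.ndjson", "w", encoding="utf-8")
                written = 0
            out.write(pair)
            written += pair_bytes
    if out is not None:
        out.close()

split_bulk_file("documents.ndjson")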