Bulk Index from Remote Host

We are planning to bulk insert about 10 GB of data; however, we are being
forced to do this from a remote host.
Is this good practice? Are there any potential issues I should watch
out for?

Any advice would be great.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/479edffe-e780-4858-b093-676b1837d668%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

That's fine, but you need to split your data into smaller bulk requests.
Don't send 10 GB in a single bulk call! :)

David

On 21 Apr 2015, at 00:40, TB txindfun@gmail.com wrote:

We are planning to bulk insert about 10 GB of data; however, we are being forced to do this from a remote host.
Is this good practice? Are there any potential issues I should watch out for?

Any advice would be great.

Hi,

The best way to approach this is to restrict the size of each bulk request
and/or the number of documents per request.

I tend to do both. The best request sizes seem to be in the 5 to 10 MiB
range; I also cap the number of documents per request (e.g. 5000), although
that isn't really necessary.
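
The batching described above could be sketched roughly as follows in Python. This is only an illustration, not what anyone in this thread actually ran: the function name `bulk_batches`, the `meta` argument and the default limits are assumptions chosen for the example. The limits mirror the 5 to 10 MiB and 5000-document figures mentioned above, and the action line uses the `index` action of the bulk format. Depending on the Elasticsearch version, `meta` may also need a `_type` entry.

```python
import json

MAX_BYTES = 10 * 1024 * 1024  # upper end of the 5 to 10 MiB range (assumed default)
MAX_DOCS = 5000               # optional cap on documents per request (assumed default)


def bulk_batches(docs, meta, max_bytes=MAX_BYTES, max_docs=MAX_DOCS):
    """Yield newline-delimited bulk bodies that respect both limits.

    docs: iterable of JSON-serialisable dicts.
    meta: dict for the action line, e.g. {"_index": "my-index"} (hypothetical name).
    A single document larger than max_bytes is still emitted, alone in its batch.
    """
    action = json.dumps({"index": meta})
    lines, size, count = [], 0, 0
    for doc in docs:
        # One bulk entry is an action line followed by the document source line.
        entry = action + "\n" + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if count and (size + entry_size > max_bytes or count >= max_docs):
            yield "".join(lines)
            lines, size, count = [], 0, 0
        lines.append(entry)
        size += entry_size
        count += 1
    if lines:
        yield "".join(lines)
```

Each yielded string is meant to be the body of one bulk request, so the sender only ever holds one batch in memory regardless of how large the full data set is.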

You should check out this link from the documentation:

Chris

David and Christopher, thanks for your advice. I split the files into
12 MB chunks, which turned out to be optimal after testing various sizes.
I wanted to draw on your experience of potential issues with bulk
indexing locally versus bulk indexing from a remote host.
I chose to bulk index locally.
