ES 1.4.2 - uploading index data seems slow

I have just migrated to ES 1.4.2. I have 5 data nodes and 1 master node;
each ES instance has a 32 GB heap (the machines have 120 GB of RAM), 2 TB
of storage, and 16 CPUs.

I am trying to upload 3 indexes simultaneously, each about 12 GB in size,
via a Pig job, and I noticed that indexing into ES is running really
slowly.

Earlier I was on ES 1.2.2, and there I always got better upload speed than
on 1.4.2. Is there a setting I should enable to speed up the upload?

Previously everything (data, master, client, etc.) ran on a single node;
currently the architecture is a little different, as described above. In
both setups each index has 5 shards and 1 replica. I don't know what has
changed in 1.4.2 - does anyone have any idea?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d4709416-6d96-4a52-b2b3-e246f6bdc0a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

FYI, once you get to a 32 GB heap the JVM loses compressed object pointers
and you lose some efficiency, so try to keep the heap under 32 GB - 31 GB
or less.

Are you using the bulk API?
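
For reference, the bulk API takes newline-delimited JSON: each action line
is followed by the document source. A minimal sketch (the index, type, and
field names here are illustrative, and the curl call assumes a node on
localhost:9200):

```shell
# Build a small bulk payload: one action line per document, then its source.
cat > bulk.json <<'EOF'
{ "index" : { "_index" : "myindex", "_type" : "doc", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "myindex", "_type" : "doc", "_id" : "2" } }
{ "field1" : "value2" }
EOF

# Send the whole batch in one round trip (uncomment against a live node):
# curl -s -XPOST 'localhost:9200/_bulk' --data-binary @bulk.json

wc -l < bulk.json   # 4 lines: 2 action lines + 2 source lines
```

One request carrying many documents avoids the per-request overhead of
indexing documents one at a time, which is usually the first thing to check
when indexing seems slow.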

On 15 January 2015 at 10:03, Bhumir Jhaveri bhumir81@gmail.com wrote:



Oh yeah - I reset that to 31 GB.
I am not sure whether I am using the bulk API - I am uploading via a Pig
script, since the amount of data to be uploaded is huge:

STORE big_data INTO '$INDEX_NAME/$DOCUMENT' USING
org.elasticsearch.hadoop.pig.EsStorage('es.input.json=true');
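
As far as I know, `EsStorage` from elasticsearch-hadoop always writes
through the bulk API under the hood (batch sizes are tunable via settings
such as `es.batch.size.entries` and `es.batch.size.bytes`), so the bigger
lever for a one-off load is usually the index settings themselves. A
hedged sketch, assuming a node on localhost:9200 and an illustrative index
name `myindex`: drop refresh and replicas for the duration of the load,
then restore them.

```shell
# Hypothetical tuning for a one-off bulk load ("myindex" is illustrative).
# Settings body: disable periodic refresh and replication while loading.
cat > bulk_settings.json <<'EOF'
{ "index" : { "refresh_interval" : "-1", "number_of_replicas" : 0 } }
EOF

# Apply before the load (uncomment against a live cluster):
# curl -XPUT 'localhost:9200/myindex/_settings' --data-binary @bulk_settings.json

# After the load completes, restore refresh and replication, e.g.:
# curl -XPUT 'localhost:9200/myindex/_settings' \
#   -d '{ "index" : { "refresh_interval" : "1s", "number_of_replicas" : 1 } }'

cat bulk_settings.json
```

With replicas at 0 the cluster does not duplicate every document during the
load; re-enabling them afterwards copies the finished segments instead,
which is typically cheaper.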

On Wednesday, January 14, 2015 at 3:13:55 PM UTC-8, Mark Walkom wrote:

