Bulk indexing with EC2 cluster?


(IronMan2014) #1

I am having some issues and would like some feedback:

#1 - I ran a test with 250 MB worth of documents against my local machine,
an i7, and it takes a total of 130 secs to index. I ran the same test against
a cluster of 2 i2.4xlarge EC2 instances, much more powerful than my local
machine, yet it takes about 200 secs.

#2 - When I index against my local machine, it shows 1500 docs indexed in
total, but on the 2 instances I see 1150 docs. Why is it different?

#3 - Aside from the above, in a separate test with a smaller number of docs,
say 500, the bulk gets called but never executes: the bulk and index.close()
exit without Bulk.execute running. When I look at the index, it is empty and
no docs were actually indexed. This doesn't happen against my local machine.

Some settings:

BulkSize: 1000 docs & 5 MB

Settings settings = ImmutableSettings.settingsBuilder()
        .put("client.transport.sniff", true)
        .put("refresh_interval", "-1")
        .put("number_of_shards", 1)
        .put("number_of_replicas", "0")
        .put("cluster.name", this.CLUSTER_NAME)
        .build();
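
The bulk code itself is not shown in this post, so the following is only a guess at what issue #3 could look like. It is a minimal sketch, assuming the indexer queues documents and sends a batch only once the "1000 docs or 5 MB" threshold is reached. The BulkBuffer class and its methods are hypothetical stand-ins and are not part of the Elasticsearch API. Under that assumption, a 500-document run never reaches the threshold, so nothing is sent unless the partial batch is flushed on close.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a bulk indexer; not part of the Elasticsearch API.
class BulkBuffer implements AutoCloseable {
    private final int maxDocs;
    private final long maxBytes;
    private final List<String> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int sentDocs = 0;
    private int sentBatches = 0;

    BulkBuffer(int maxDocs, long maxBytes) {
        this.maxDocs = maxDocs;
        this.maxBytes = maxBytes;
    }

    // Queue one document and send the batch once either threshold is reached.
    void add(String json) {
        pending.add(json);
        pendingBytes += json.getBytes(StandardCharsets.UTF_8).length;
        if (pending.size() >= maxDocs || pendingBytes >= maxBytes) {
            flush();
        }
    }

    // Send whatever is queued. A real client would execute the bulk request here.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentDocs += pending.size();
        sentBatches++;
        pending.clear();
        pendingBytes = 0;
    }

    // Flush the partial batch before closing; without this call,
    // a run smaller than maxDocs would send nothing at all.
    @Override
    public void close() {
        flush();
    }

    int sentDocs() { return sentDocs; }

    int sentBatches() { return sentBatches; }

    public static void main(String[] args) {
        BulkBuffer buffer = new BulkBuffer(1000, 5L * 1024 * 1024);
        for (int i = 0; i < 500; i++) {
            buffer.add("{\"id\":" + i + "}");
        }
        System.out.println("sent before close: " + buffer.sentDocs());
        buffer.close();
        System.out.println("sent after close: " + buffer.sentDocs());
    }
}
```

If the real code follows this pattern, the thing to check is whether the remaining documents are executed before the client is closed. This would not by itself explain why the local run behaves differently.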

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e55dd715-43dd-45a8-ad50-d543792f0481%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Binh Ly-2) #2
1. There is network latency going up to EC2. :)

2/3. Not sure. Can you show your bulk code? Did you check the logs on your
EC2 instances to see if there were any errors?
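
Besides the server logs, the bulk response itself is worth inspecting, since a bulk request can succeed as a whole while individual items fail. That would be one possible reason for a count like 1150 instead of 1500. The sketch below only models the idea in plain Java: ItemResult and BulkReport are hypothetical types made up for illustration, and the sample results are invented. In the Elasticsearch Java client, the per-item status is exposed through methods such as BulkResponse.hasFailures() and BulkItemResponse.isFailed().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one item in a bulk response.
class ItemResult {
    final String id;
    final String error; // null when the document was indexed

    ItemResult(String id, String error) {
        this.id = id;
        this.error = error;
    }

    boolean failed() {
        return error != null;
    }
}

// Hypothetical helper that separates failed items from indexed ones.
class BulkReport {
    // Return only the failed items so they can be logged or retried.
    static List<ItemResult> failures(List<ItemResult> results) {
        List<ItemResult> failed = new ArrayList<>();
        for (ItemResult result : results) {
            if (result.failed()) {
                failed.add(result);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Invented sample data: three items indexed, one rejected.
        List<ItemResult> results = Arrays.asList(
                new ItemResult("1", null),
                new ItemResult("2", "rejected"),
                new ItemResult("3", null),
                new ItemResult("4", null));
        List<ItemResult> failed = failures(results);
        System.out.println("indexed " + (results.size() - failed.size())
                + " of " + results.size());
        for (ItemResult item : failed) {
            System.out.println("failed id=" + item.id + ": " + item.error);
        }
    }
}
```

Logging the failed ids and messages after every bulk call makes it possible to compare the number of documents sent with the number actually indexed.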

(In reply to IronMan2014's post of Tuesday, March 25, 2014 1:52:57 PM UTC-4.)


(IronMan2014) #3

Thanks, I was thinking the same about network latency, so I am going to try
running the test directly on the instance itself.
I will check on the other issues and update my post.

(In reply to Binh Ly's post of Wednesday, March 26, 2014 4:06:33 PM UTC-4.)

