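On 28 Jan 2013 at 17:54, Ankit Jain <ankitj...@gmail.com> wrote:
Hi All,
I am creating an index using the Elasticsearch API, but I am not getting good write speed.
Below is the code I use to create the client and to dump the records.
// Creates the client in our case
if (localMode != null && localMode) {
    Settings settings = ImmutableSettings.settingsBuilder().put("node.local", true).build();
    client = nodeBuilder().client(true).node().client();
} else {
    Settings settings = ImmutableSettings.settingsBuilder().build();
    client = new TransportClient(settings).addTransportAddress(
            new InetSocketTransportAddress(elasticSearchHost, elasticSearchPort));
}
// Record dumping part
Long MobileNumber = new Long(0);
System.out.println(("{\"mobileNumberUserId\":" + MobileNumber.toString() + "}").getBytes().length);
Long k = null, f = null;
k = new Date().getTime();
for (Long i = (long) 0; i < 10000; i++) {
    // One synchronous request per document
    IndexResponse response2 = this.client
            .prepareIndex("dummymobilenumber4", "Number", MobileNumber.toString())
            .setSource(("{\"mobileNumberUserId\":" + MobileNumber.toString() + "}").getBytes())
            .execute().actionGet();
    MobileNumber++;
}
f = new Date().getTime();
System.out.println("Indexing Speed = " + ((240) / (f - k)) + " Time " + (f - k));
Guys, I need your help to increase the write performance.
Thanks,
Regards,
Ankit Jain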
On 28.01.13 at 21:23, ppearcy wrote:
I had heard at one point that there isn't much difference between using the Java API per document and using the bulk APIs; perhaps that is no longer (or never was) the case.
Anyway, in that specific code chunk you call actionGet on every request, which makes each call synchronous. Instead, you should save the action futures and call actionGet only after all requests have been submitted.
I'd be curious to know how that performs compared to the bulk API.
Best Regards,
Paul
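The pattern Paul describes (submit everything first, wait afterwards) can be sketched with plain java.util.concurrent futures. This is only an illustration under assumptions: indexOne is a hypothetical stand-in for a single index request, Future.get() plays the role that actionGet() plays in the original code, and no Elasticsearch classes are involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SubmitThenWait {

    // Hypothetical stand-in for one index request; a real call would cross the network.
    static String indexOne(long id) {
        return "indexed-" + id;
    }

    // Submits every request first, keeps the futures, and only then blocks on each one.
    static List<String> indexAll(int count, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (long id = 0; id < count; id++) {
                final long docId = id;
                Callable<String> task = () -> indexOne(docId);
                pending.add(pool.submit(task)); // returns immediately
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get()); // blocks here, like actionGet()
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> results = indexAll(5, 2);
        System.out.println(results.size() + " requests completed");
    }
}
```

This overlaps the waiting time of the individual requests, but every document is still sent as its own request.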
On 28 Jan 2013 at 21:32, Jörg Prante <joerg...@gmail.com> wrote:
There is a difference:
Writing 100 docs with one request per doc: 100 packets to the cluster, and the cluster must acknowledge each of the 100 packets on the wire = 100 round trips.
Writing 100 docs in one bulk request: 1 packet to the cluster and 1 back to the client (with reasonable doc size) = 1 round trip.
So you save a huge amount of communication overhead.
Saving the actionGet calls for later only creates thread congestion; each document still ends up as a single request and response on the wire.
Jörg
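Jörg's arithmetic can be written down as a small sketch. The class and method names below are hypothetical. roundTrips counts request/response pairs for a given batch size, and bulkBody builds the newline-delimited payload that the REST _bulk endpoint accepts (one action line plus one source line per document). The index and type names are taken from the original post; the _type field is an assumption that matches typed indices of that era, and newer Elasticsearch versions no longer use it.

```java
final class BulkBatching {

    // Number of request/response round trips needed to send `docs` documents
    // when each request carries at most `bulkSize` documents.
    static int roundTrips(int docs, int bulkSize) {
        if (docs < 0 || bulkSize < 1) {
            throw new IllegalArgumentException("docs must be >= 0 and bulkSize >= 1");
        }
        return (docs + bulkSize - 1) / bulkSize; // ceiling division
    }

    // Builds one bulk payload: an action line followed by a source line per document,
    // each terminated by a newline.
    static String bulkBody(String index, String type, long firstId, int count) {
        StringBuilder body = new StringBuilder();
        for (long id = firstId; id < firstId + count; id++) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type)
                .append("\",\"_id\":\"").append(id).append("\"}}\n");
            body.append("{\"mobileNumberUserId\":").append(id).append("}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.println("single-doc requests: " + roundTrips(100, 1) + " round trips");
        System.out.println("one bulk request:    " + roundTrips(100, 100) + " round trip");
        System.out.print(bulkBody("dummymobilenumber4", "Number", 0, 2));
    }
}
```

With the 10000 documents from the original loop and a batch size of 1000, the same calculation gives 10 round trips instead of 10000.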
On Monday, January 28, 2013 10:05:17 AM UTC-7, David Pilato wrote:
Use the Bulk API:
https://github.com/elasticsearchfr/hands-on/blob/answers/src/test/java/org/elasticsearchfr/handson/ex1/IndexTest.java#L113
On 28 Jan 2013 at 17:54, Ankit Jain <ankitj...@gmail.com> wrote:
Hi All,
I am creating an index using the Elasticsearch API, but I am not getting good write speed.
Below is the code I use to create the client and to dump the records.
// Creates the client in our case
if (localMode != null && localMode) {
    Settings settings = ImmutableSettings.settingsBuilder().put("node.local", true).build();
    client = nodeBuilder().client(true).node().client();
} else {
    Settings settings = ImmutableSettings.settingsBuilder().build();
    client = new TransportClient(settings).addTransportAddress(
            new InetSocketTransportAddress(elasticSearchHost, elasticSearchPort));
}
// Record dumping part
Long MobileNumber = new Long(0);
System.out.println(("{\"mobileNumberUserId\":" + MobileNumber.toString() + "}").getBytes().length);
Long k = null, f = null;
k = new Date().getTime();
for (Long i = (long) 0; i < 10000; i++) {
    IndexResponse response2 = this.client
            .prepareIndex("dummymobilenumber4", "Number", MobileNumber.toString())
            .setSource(("{\"mobileNumberUserId\":" + MobileNumber.toString() + "}").getBytes())
            .execute().actionGet();
    MobileNumber++;
}
f = new Date().getTime();
System.out.println("Indexing Speed = " + ((240) / (f - k)) + " Time " + (f - k));
Guys, I need your help to increase the write performance.
Thanks,
Regards,
Ankit Jain
On Monday, January 28, 2013 1:37:37 PM UTC-7, David Pilato wrote:
To add to Jörg's answer: with bulk, Elasticsearch groups the documents by shard. It then only has to send each group of docs to the right shard and index them together, instead of handling them one by one.
--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
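The per-shard grouping David mentions can be illustrated roughly as follows. This is a sketch under loud assumptions, not Elasticsearch's implementation: shardFor uses String.hashCode() as a stand-in, whereas Elasticsearch applies its own hash function to the routing value (the document id by default) before taking it modulo the number of primary shards. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ShardGrouping {

    // Stand-in routing function (assumption): the real hash function differs,
    // only the "hash modulo shard count" shape is the same.
    static int shardFor(String id, int numShards) {
        return Math.floorMod(id.hashCode(), numShards);
    }

    // Splits the ids of one bulk request into one sub-batch per target shard.
    static Map<Integer, List<String>> groupByShard(List<String> ids, int numShards) {
        Map<Integer, List<String>> byShard = new TreeMap<>();
        for (String id : ids) {
            int shard = shardFor(id, numShards);
            byShard.computeIfAbsent(shard, key -> new ArrayList<>()).add(id);
        }
        return byShard;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ids.add(String.valueOf(i));
        }
        // Each entry is what would be forwarded to one shard in a single step.
        groupByShard(ids, 5).forEach((shard, docs) ->
                System.out.println("shard " + shard + " <- " + docs));
    }
}
```

With single-document requests, each document is routed and forwarded on its own; with a bulk request, the routing step happens once and each shard receives its whole sub-batch.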
Yeah, that's a valid point regarding the extra network I/O. It really comes down to how quickly you want data to show up in search, and whether batching items together makes sense for your use case.
The thread congestion can be mitigated with a different threadpool type for indexing:
Best Regards,
Paul