Hi
I am using the Java API to index data, and my ES version is 0.19.8. It worked
fine until about 155,000 documents had been indexed.
After that, ES takes too much time to index the data. The code is at the following
link:
https://gist.github.com/3225800
Thanks
You should use bulk indexing instead of indexing every individual
document separately. You are also refreshing the index after every
single index operation, which is time-consuming. Use bulk indexing and
disable refreshing (set the interval to -1) during batch indexing.
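As a rough illustration of this advice, the sketch below builds the newline-delimited body that the REST `_bulk` endpoint expects, plus the settings body that sets `index.refresh_interval` to `-1`. It is plain Java with no Elasticsearch dependency. The class name `BulkBodySketch`, the index name `feeds`, and the type name `feed` are made up for the example, and each document is assumed to be valid JSON on a single line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Elasticsearch client library.
final class BulkBodySketch {

    /**
     * Builds a _bulk request body: for every document, one action line
     * followed by one source line, each terminated by a newline.
     * Assumes index and type names need no JSON escaping and that each
     * document is single-line JSON.
     */
    public static String buildBulkBody(String index, String type, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (String doc : jsonDocs) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            body.append(doc).append('\n');
        }
        return body.toString();
    }

    /** Settings body for the refresh interval: "-1" disables periodic refresh. */
    public static String refreshSettings(String interval) {
        return "{\"index\":{\"refresh_interval\":\"" + interval + "\"}}";
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("{\"title\":\"first\"}", "{\"title\":\"second\"}");
        System.out.print(buildBulkBody("feeds", "feed", docs));
        System.out.println(refreshSettings("-1"));
    }
}
```

In this sketch the bulk body would be POSTed to `/_bulk` and the settings body PUT to `/feeds/_settings`. After the batch, the interval can be set back, for example to `1s`.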
--
Ivan
On Wed, Aug 1, 2012 at 4:06 AM, Simi MA simi.ma@algotree.com wrote:
Hi
I am using the Java API to index data, and my ES version is 0.19.8. It worked
fine until about 155,000 documents had been indexed.
After that, ES takes too much time to index the data. The code is at the following
link:
https://gist.github.com/3225800
Thanks
Hello,
The feeds arrive one by one, and we are looking for a real-time indexing
solution, so we can't use bulk here.
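If feeds really do arrive one at a time, one possible middle ground (not something proposed in this thread) is to buffer them briefly and send small batches. The sketch below flushes when a batch is full or when the oldest pending feed has waited too long. `FeedBatcher`, `BulkSink`, and the thresholds are hypothetical names and values. The sink stands in for whatever code would send one bulk request. The caller passes in the current time, so a real version would also need a timer that calls `flush()` when no new feeds arrive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical micro-batching buffer; no Elasticsearch dependency.
final class FeedBatcher {

    /** Stand-in for code that sends one bulk request with the given documents. */
    interface BulkSink {
        void send(List<String> docs);
    }

    private final int maxDocs;
    private final long maxDelayMillis;
    private final BulkSink sink;
    private final List<String> pending = new ArrayList<String>();
    private long oldestMillis;

    FeedBatcher(int maxDocs, long maxDelayMillis, BulkSink sink) {
        this.maxDocs = maxDocs;
        this.maxDelayMillis = maxDelayMillis;
        this.sink = sink;
    }

    /** Buffers one feed; flushes if the batch is full or the oldest feed is too old. */
    synchronized void add(String jsonDoc, long nowMillis) {
        if (pending.isEmpty()) {
            oldestMillis = nowMillis; // remember when this batch started
        }
        pending.add(jsonDoc);
        if (pending.size() >= maxDocs || nowMillis - oldestMillis >= maxDelayMillis) {
            flush();
        }
    }

    /** Sends everything that is pending as one batch; does nothing if empty. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.send(new ArrayList<String>(pending)); // copy so the sink owns its list
        pending.clear();
    }

    public static void main(String[] args) {
        FeedBatcher batcher = new FeedBatcher(3, 1000L,
                docs -> System.out.println("bulk request with " + docs.size() + " docs"));
        long now = System.currentTimeMillis();
        for (int i = 1; i <= 7; i++) {
            batcher.add("{\"n\":" + i + "}", now);
        }
        batcher.flush(); // send whatever is left over
    }
}
```

With a short delay limit, each feed still becomes searchable soon after it arrives, while the cluster sees far fewer requests than with one index call per feed.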
We have also observed that with about 1.5 lakh (150,000) feeds in ES, it takes
around 45 minutes to index a single feed using the transport protocol on port 9300,
but only about a second using the standard port 9200 with the curl tool.
What do you think is the reason for this?
Thanks
Vineeth
On Wed, Aug 1, 2012 at 10:03 PM, Ivan Brusic ivan@brusic.com wrote:
You should use bulk indexing instead of indexing every individual
document separately. You are also refreshing the index after every
single index operation, which is time-consuming. Use bulk indexing and
disable refreshing (set the interval to -1) during batch indexing.
--
Ivan
I thought 9300 was for ES nodes to communicate with each other.
Should you be using 9200 instead?
--
Shaun
On Friday, 3 August 2012 at 15:39, Vineeth Mohan wrote:
Hello,
The feeds arrive one by one, and we are looking for a real-time indexing
solution, so we can't use bulk here.
We have also observed that with about 1.5 lakh (150,000) feeds in ES, it takes
around 45 minutes to index a single feed using the transport protocol on port 9300,
but only about a second using the standard port 9200 with the curl tool.
What do you think is the reason for this?
Thanks
Vineeth
It's the port used for the transport client.
@Shay - Please let me know what you think about this.
Thanks
Vineeth
On Fri, Aug 3, 2012 at 12:27 PM, Shaun Etherton shaun.etherton@gmail.com wrote:
I thought 9300 was for ES nodes to communicate with each other.
Should you be using 9200 instead?
--
Shaun
Oh, I see.
Shaun
On Friday, 3 August 2012 at 18:42, Vineeth Mohan wrote:
It's the port used for the transport client.
@Shay - Please let me know what you think about this.
Thanks
Vineeth
I'm seeing similar results. If anyone has any suggestions, that would be
great.
Removing refresh should help.
Then, for bulk indexing, it's best to use bulk features.
See: http://www.elasticsearch.org/guide/reference/java-api/bulk.html
Hope this helps.
David.
On 9 August 2012 at 10:33, Wes Plunk wes@wesandemily.com wrote:
I'm seeing similar results. If anyone has any suggestions, that would be great.
On Wednesday, August 1, 2012 6:06:56 AM UTC-5, Aami wrote:
Hi
I am using the Java API to index data, and my ES version is 0.19.8. It worked
fine until about 155,000 documents had been indexed.
After that, ES takes too much time to index the data. The code is at the
following link:
https://gist.github.com/3225800
Thanks
--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs