We now have a nice BulkProcessor class which handles that properly.
--
David 
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 13 Sept. 2013, at 08:25, Jun Ohtani johtani@gmail.com wrote:
Hi David,
Thanks for the reply.
I opened issue #12 in the GitHub project.
I have another idea.
Currently, bulk processing is triggered only by the bulkSize check.
I suggest that PageCallback.processBulkIfNeeded also process the bulk after an interval given by a parameter, like Solr's auto-commit.
But I have no idea how to implement this at the moment.
I will try to think about that a little as well.
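
One way the interval idea could look, as a rough sketch only: the buffer is flushed either when it is full or when its oldest pending document has waited longer than a configured interval. TimedBulkBuffer, flushIntervalMillis and the injected clock are hypothetical names for illustration, not part of the river, and the flush here only counts instead of sending a bulk request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical buffer that flushes on size OR on age; not the river's actual code. */
class TimedBulkBuffer {
    private final int bulkSize;
    private final long flushIntervalMillis;
    private final LongSupplier clock;            // injected so the timing can be tested
    private final List<String> buffer = new ArrayList<>();
    private long oldestPendingMillis;            // when the first pending document arrived
    private int flushCount = 0;

    TimedBulkBuffer(int bulkSize, long flushIntervalMillis, LongSupplier clock) {
        this.bulkSize = bulkSize;
        this.flushIntervalMillis = flushIntervalMillis;
        this.clock = clock;
    }

    void add(String doc) {
        if (buffer.isEmpty()) {
            oldestPendingMillis = clock.getAsLong();
        }
        buffer.add(doc);
        processBulkIfNeeded();
    }

    /** Flushes when the buffer is full or its oldest document is older than the interval. */
    void processBulkIfNeeded() {
        if (buffer.isEmpty()) {
            return;
        }
        boolean full = buffer.size() >= bulkSize;
        boolean stale = clock.getAsLong() - oldestPendingMillis >= flushIntervalMillis;
        if (full || stale) {
            flush();
        }
    }

    private void flush() {
        // A real implementation would execute the bulk request here.
        buffer.clear();
        flushCount++;
    }

    int flushCount() { return flushCount; }
    int pending() { return buffer.size(); }
}

class TimedBulkBufferDemo {
    public static void main(String[] args) {
        long[] now = {0L};                       // fake clock, in milliseconds
        TimedBulkBuffer buffer = new TimedBulkBuffer(10000, 5000L, () -> now[0]);
        buffer.add("page-1");
        System.out.println(buffer.pending());    // 1: neither full nor stale yet
        now[0] = 5000L;                          // five seconds pass
        buffer.processBulkIfNeeded();
        System.out.println(buffer.pending());    // 0: flushed because of age
    }
}
```

Note that the age check only runs when processBulkIfNeeded() is called, so something like a scheduled task would still have to call it periodically when no new pages arrive.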
Jun Ohtani
blog : http://blog.johtani.info
On 2013/09/13, at 15:05, David Pilato david@pilato.fr wrote:
You're probably right. I fixed something like this in other rivers.
Could you open an issue in the wikipedia river project?
--
David 
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 13 Sept. 2013, at 07:26, Jun Ohtani johtani@gmail.com wrote:
Hi all,
I am using river-wikipedia to index the Japanese Wikipedia XML.
When I set the parameter "bulk_size" : 10000, the number of indexed documents is 1540000.
But the XML includes 1546721 pages.
I have a question after reading the WikipediaRiver.java source code.
Probably the reason is that the PageCallback.processBulkIfNeeded() method indexes documents only
if the number of buffered documents is more than bulkSize.
I would expect the WikipediaRiver.close() method to index the remaining documents,
but that method only serves to delete the river settings.
What do you think about it?
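
The numbers above fit this explanation: 1546721 pages with a bulk size of 10000 give 154 full bulks (1540000 documents) and leave 6721 pages in the buffer. Below is a minimal sketch of that behaviour and of a flush on close. BufferedIndexer is a hypothetical stand-in, not the actual WikipediaRiver code, and the threshold is modeled as `>=`, an assumption chosen because it reproduces the reported count.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the river's bulk buffer; not the actual WikipediaRiver code. */
class BufferedIndexer {
    private final int bulkSize;
    private final List<String> buffer = new ArrayList<>();
    private long indexed = 0;

    BufferedIndexer(int bulkSize) {
        this.bulkSize = bulkSize;
    }

    void add(String doc) {
        buffer.add(doc);
        processBulkIfNeeded();
    }

    /** Size check only: a partially filled buffer is never sent by this method. */
    private void processBulkIfNeeded() {
        if (buffer.size() >= bulkSize) {
            flush();
        }
    }

    /** Counts the buffered documents as indexed; a real version would run the bulk request. */
    private void flush() {
        indexed += buffer.size();
        buffer.clear();
    }

    /** The suggested fix: send whatever is left when the river shuts down. */
    void close() {
        flush();
    }

    long indexed() { return indexed; }
    int pending() { return buffer.size(); }
}

class BufferedIndexerDemo {
    public static void main(String[] args) {
        BufferedIndexer indexer = new BufferedIndexer(10000);
        for (int i = 0; i < 1546721; i++) {
            indexer.add("page-" + i);
        }
        // prints "1540000 indexed, 6721 pending"
        System.out.println(indexer.indexed() + " indexed, " + indexer.pending() + " pending");
        indexer.close();
        // prints "1546721 indexed after close"
        System.out.println(indexer.indexed() + " indexed after close");
    }
}
```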
Jun Ohtani
twitter : http://twitter.com/johtani
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.