MongoDB river not copying all of the data from MongoDB to ES


Hi,
I have successfully created a river between a MongoDB database and an
Elasticsearch instance.
The specific MongoDB collection being indexed has 8M+ documents.
However, once the river is set up and running,
fewer than half of those documents are copied/transferred.

I am using elasticsearch-river-mongodb 2.0.7 with Elasticsearch 1.4.4.
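For reference, the river definition looks roughly like the sketch below
(Python with the requests library is used here purely for illustration; the
MongoDB server address, database name, and collection name are placeholders,
while the target index and type match the names that appear in the trace log
further down):

import json
import requests  # assumption: the requests library is available

ES = "http://localhost:9200"  # placeholder Elasticsearch address

# Illustrative river definition. "mydb", "one_pct_sane", and the MongoDB
# server address are placeholders; the index name and type follow the
# values visible in the trace log (index: twitter, type: one-pct-sane).
river_meta = {
    "type": "mongodb",
    "mongodb": {
        "servers": [{"host": "localhost", "port": 27017}],
        "db": "mydb",
        "collection": "one_pct_sane",
    },
    "index": {
        "name": "twitter",
        "type": "one-pct-sane",
    },
}

# Register the river by writing its _meta document.
resp = requests.put(ES + "/_river/one_pct_sane/_meta", data=json.dumps(river_meta))
print(resp.status_code, resp.text)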

Here is a sampling of the trace log messages from ES:

[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Insert operation - id: 553148b8e4b09c4dd2407f92 - contains attachment: false
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] updateBulkRequest for id: [553148b8e4b09c4dd2407fa6], operation: [INSERT]
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Operation: INSERT - index: twitter - type: one-pct-sane - routing: null - parent: null
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Insert operation - id: 553148b8e4b09c4dd2407fa6 - contains attachment: false
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] updateBulkRequest for id: [553148bae4b09c4dd2407fd5], operation: [INSERT]
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Operation: INSERT - index: twitter - type: one-pct-sane - routing: null - parent: null
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Insert operation - id: 553148bae4b09c4dd2407fd5 - contains attachment: false
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] updateBulkRequest for id: [553148bae4b09c4dd2407ffe], operation: [INSERT]
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Operation: INSERT - index: twitter - type: one-pct-sane - routing: null - parent: null
[2015-04-17 12:56:22,045][TRACE][org.elasticsearch.river.mongodb.Indexer] Insert operation - id: 553148bae4b09c4dd2407ffe - contains attachment: false
[2015-04-17 12:56:22,055][TRACE][org.elasticsearch.river.mongodb.MongoDBRiver] setLastTimestamp [one_pct_sane] [one-pct-sane.current] [Timestamp.BSON(ts={ "$ts" : 1429282637 , "$inc" : 2})]
[2015-04-17 12:56:22,095][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] afterBulk - bulk [57498] success [80 items] [59 ms] total [3952638]
[2015-04-17 12:56:22,217][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] bulkQueueSize [50] - queue [0] - availability [1]
[2015-04-17 12:56:22,217][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] beforeBulk - new bulk [57499] of items [49]
[2015-04-17 12:56:22,259][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] afterBulk - bulk [57499] success [49 items] [42 ms] total [3952687]
[2015-04-17 12:56:22,387][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] bulkQueueSize [50] - queue [0] - availability [1]
[2015-04-17 12:56:22,387][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] beforeBulk - new bulk [57500] of items [1]
[2015-04-17 12:56:22,389][TRACE][org.elasticsearch.river.mongodb.MongoDBRiverBulkProcessor] afterBulk - bulk [57500] success [1 items] [2 ms] total [3952688]
[2015-04-17 12:56:22,390][INFO ][cluster.metadata ] [Star-Dancer] [_river] update_mapping [one_pct_sane] (dynamic)
[2015-04-17 13:06:20,497][INFO ][cluster.metadata ] [Star-Dancer] [_river] update_mapping [one_pct_sane2] (dynamic)
[2015-04-17 13:06:20,513][INFO ][cluster.metadata ] [Star-Dancer] [_river] update_mapping [one_pct_sane2] (dynamic)
[2015-04-17 13:06:20,523][INFO ][cluster.metadata ] [Star-Dancer] [_river] update_mapping [one_pct_sane2] (dynamic)
[2015-04-17 13:06:22,394][INFO ][cluster.metadata ] [Star-Dancer] [_river] update_mapping [one_pct_sane2] (dynamic)

Among the several questions I have, these are the main ones:

  1. Does the river copy data based on what exists in the oplog? (How does
     the river use the oplog to retrieve the data?)

  2. There aren't any obvious errors being shown, and documents do come in,
     but as I mentioned earlier, fewer than half of the documents in MongoDB
     are being copied over (see the count-comparison sketch after this list).
     Why would that be?
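To quantify the shortfall mentioned in question 2, a count comparison along
the following lines can be used (a minimal sketch; the host addresses and the
database/collection names are placeholders, and the index/type follow the
trace log above):

import requests
from pymongo import MongoClient  # assumption: pymongo is installed

# Count source documents in MongoDB (database/collection names are placeholders).
mongo = MongoClient("localhost", 27017)
mongo_count = mongo["mydb"]["one_pct_sane"].count()  # count_documents({}) on newer pymongo

# Count indexed documents in Elasticsearch for the river's target index/type.
resp = requests.get("http://localhost:9200/twitter/one-pct-sane/_count")
es_count = resp.json()["count"]

print("MongoDB: %d  Elasticsearch: %d  missing: %d"
      % (mongo_count, es_count, mongo_count - es_count))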

Thanks for any help/assistance

Ramdev
