Reindexing data mongodb-es


(coys) #1

Hello, sorry if this is a silly question, but I can't figure out the
solution.
I have data stored in mongodb, and the collections are mapped to es
indices using richardwilly's plugin. However, a couple of my indices
are messed up, so not all the data that I expect to see is in es
(it's still in mongodb). I tried creating a dummy index on dummy
data, and I expected that after re-indexing I would see this data in
es.

The problem seems to be that the mongo river operates on the oplog.
After I delete and recreate the index and insert the first new
document, I want the other thousands of documents in mongodb to
become visible in es automatically. However, I only see the documents
that I inserted after deleting and recreating the indexes. The other
thousands of documents are still visible in mongo but not in es.

I did a small experiment and saw that if I actually reinsert the
500 documents, they then become visible in elasticsearch (provided the
index is right, so that it lets them all in). Can you please tell me
how I can make the data in mongodb visible in es after I recreate the
index, without having to delete and reinsert it, as I cannot do this?
Do I need to replay the oplog, or is there another approach you can
suggest to get this data into es without deleting and reinserting?

Thanks!

--


(David Pilato) #2

Did you try removing the river itself and creating it again?
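
For anyone reading along, here is a rough sketch of what "remove and create again" could look like over the REST API, written in Python with only the standard library. The thread gives no names or settings, so the host, the river name `mongo_river`, and the db, collection and index names are all placeholders. The `_meta` body follows the general shape used by the mongodb river plugin, but check the README of the version you run. My understanding, not something stated in this thread, is that the river keeps track of how far it has read the oplog, so deleting it clears that position.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local Elasticsearch node


def delete_river_request(river_name, es_url=ES_URL):
    """Build the DELETE request that removes the river definition."""
    return urllib.request.Request(f"{es_url}/_river/{river_name}", method="DELETE")


def create_river_request(river_name, db, collection, index, es_url=ES_URL):
    """Build the PUT request that registers the river again via its _meta doc."""
    meta = {
        "type": "mongodb",
        "mongodb": {"db": db, "collection": collection},
        # Assumption: the collection name is reused as the ES type name.
        "index": {"name": index, "type": collection},
    }
    return urllib.request.Request(
        f"{es_url}/_river/{river_name}/_meta",
        data=json.dumps(meta).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def recreate_river(river_name, db, collection, index, es_url=ES_URL):
    """Send both requests in order: remove the river, then create it again."""
    requests = (
        delete_river_request(river_name, es_url),
        create_river_request(river_name, db, collection, index, es_url),
    )
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            resp.read()
```

Calling `recreate_river("mongo_river", "mydb", "mycollection", "myindex")` would send the two requests against a running node. All four names here are made up for illustration.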

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 27 Sept. 2012 at 23:24, coys gautam719@gmail.com wrote:

--


(coys) #3

Thanks a lot David! That worked :slight_smile:

--

