River plugin clarification

Janusz_Dalecki · January 2, 2013, 9:27am

Hi,

The doc on elasticsearch River plugin says:

“A river instance (and its name) is a type within the _river index. All
different rivers implementations accept a document called _meta that at the
very least has the type of the river (twitter / couchdb / …) associated
with it.”

Isn’t “_meta” an ‘id’ or an ‘_action’ according to the Elasticsearch
documentation?

http://host:port/[index]/[type]/[_action/id]

Can somebody explain an example like this one? My questions are in the comments:

```
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",          // type
  "mongodb": {                // mongodb instance: does it have to be the same as the URL type?
    "db": "testmongo",        // I think that's straightforward
    "collection": "person"    // I think that's straightforward
  },
  "index": {
    "name": "mongoindex",
    "type": "person"          // why do I have to repeat it? (it's already defined as the collection)
  }
}'
```

As I understand the URL:

```
_river – an index
mongodb – a type
_meta – an id
```

Regards,

Janusz

--

radu_gheorghe · January 3, 2013, 11:07am

Hi,

On Wed, Jan 2, 2013 at 11:27 AM, JD jdalecki@tycoint.com wrote:

> The doc on elasticsearch River plugin says:
>
> “A river instance (and its name) is a type within the _river index. All
> different rivers implementations accept a document called _meta that at the
> very least has the type of the river (twitter / couchdb / …) associated
> with it.”
>
> Isn’t “_meta” an ‘id’ or an ‘_action’ according to the Elasticsearch
> documentation?

Yes, "_meta" is the document ID, as far as I understand.

> http://host:port/[index]/[type]/[_action/id]
>
> Can somebody explain an example like this one?
>
> curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
>   "type": "mongodb",          // type
>   "mongodb": {                // mongodb instance: does it have to be the same as the URL type?

No, the URL type is the name of your river (which can be anything AFAIK),
while "mongodb" is a field that's required by the mongodb plugin.

>     "db": "testmongo",        // I think that's straightforward
>     "collection": "person"    // I think that's straightforward
>   },
>   "index": {
>     "name": "mongoindex",
>     "type": "person"          // why do I have to repeat it? (it's already defined as the collection)

The type here is the ES type that the data from your collection is indexed
into. It doesn't have to have the same name as the collection; it can be
anything.

>   }
> }'
>
> _river – an index
> mongodb – a type
> _meta – an id
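To make the naming question above concrete, here is a minimal Python sketch that only builds the URL and the `_meta` body (it does not contact Elasticsearch). The helper name `build_river_request`, the river name `person_river`, and the ES type `person_doc` are made up for illustration; the body fields are the ones from the example in this thread.

```python
import json


def build_river_request(river_name, db, collection, es_index, es_type,
                        host="http://localhost:9200"):
    """Return (url, body) for registering a mongodb river.

    Hypothetical helper: it only assembles the request, it sends nothing.
    """
    # URL layout is /[index]/[type]/[id]:
    #   _river      -> the index
    #   river_name  -> the type (the river's name, chosen by you)
    #   _meta       -> the document id
    url = f"{host}/_river/{river_name}/_meta"
    body = {
        "type": "mongodb",               # which river plugin handles this river
        "mongodb": {                     # settings block read by the mongodb plugin
            "db": db,
            "collection": collection,
        },
        "index": {
            "name": es_index,            # target Elasticsearch index
            "type": es_type,             # target ES type, independent of the collection name
        },
    }
    return url, body


url, body = build_river_request(
    river_name="person_river",           # assumed name, differs from "mongodb" on purpose
    db="testmongo",
    collection="person",
    es_index="mongoindex",
    es_type="person_doc",                # assumed name, differs from the collection on purpose
)
print(url)  # http://localhost:9200/_river/person_river/_meta
print(json.dumps(body, indent=2))
```

The point of the sketch is that three names that happen to coincide in the original example are independent: the river name in the URL, the `"type": "mongodb"` plugin selector, and the target ES type.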

If you want more info about the mongodb river itself, I think the best
place to look (if you didn't already :D) is here:

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--

Janusz_Dalecki · January 4, 2013, 8:05am

Hi,

What I find a little confusing in the mongodb river doc is the lack of an
example for a multi-collection setup.

The wiki doc says that you need to create a new river for a MongoDB
collection and gives this example:

$ curl -XPUT "localhost:9200/_river/mongodb/_meta" -d '
{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
      { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
    ],
    "options": { "secondary_read_preference": true },
    "credentials": [
      { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
      { "db": ${mongo.db.name}, "user": ${mongo.db.user}, "password": ${mongo.db.password} }
    ],
    "db": ${mongo.db.name},
    "collection": ${mongo.collection.name},
    "gridfs": ${mongo.is.gridfs.collection},
    "filter": ${mongo.filter}
  },
  "index": {
    "name": ${es.index.name},
    "throttle_size": ${es.throttle.size},
    "type": ${es.type.name}
  }
}'

I tried it, and it does not work until I use a URL like this:
"localhost:9200/_river/{collection_name_river}/_meta"

In other words, I need to replace the word “mongodb” in the URL with a
unique river name for the collection, which I think is not clearly stated
in the documentation. Am I right?

Regards,

Janusz


--

radu_gheorghe · January 5, 2013, 4:18pm

Hi Janusz,

On Fri, Jan 4, 2013 at 10:05 AM, JD jdalecki@tycoint.com wrote:

> I tried it, and it does not work until I use a URL like this:
> "localhost:9200/_river/{collection_name_river}/_meta"
>
> In other words, I need to replace the word “mongodb” in the URL with a
> unique river name for the collection, which I think is not clearly stated
> in the documentation. Am I right?

Yeah, I don't see anything like that in the documentation. It's the first
time I've seen that reported, although I don't use the mongodb river
plugin myself.

Either way, it seems to me that if you want to use the river with multiple
collections, you'd have to set up one river for each collection.
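As a rough illustration of "one river per collection", the Python sketch below generates one registration command per collection, each with its own river name in the URL. Everything here is an assumption for illustration: the helper name `rivers_for_collections`, the `<db>_<collection>_river` naming scheme, the second collection `address`, and the choice to reuse the collection name as the ES type. It only prints curl commands; it does not talk to Elasticsearch.

```python
import json
import shlex


def rivers_for_collections(db, collections, es_index,
                           host="http://localhost:9200"):
    """Return a list of (url, body) pairs, one river per MongoDB collection.

    Hypothetical helper; the naming scheme below is an assumption.
    """
    requests = []
    for collection in collections:
        # Each river needs its own name, i.e. its own type under the _river index.
        river_name = f"{db}_{collection}_river"
        url = f"{host}/_river/{river_name}/_meta"
        body = {
            "type": "mongodb",
            "mongodb": {"db": db, "collection": collection},
            # Reusing the collection name as the ES type is a choice, not a requirement.
            "index": {"name": es_index, "type": collection},
        }
        requests.append((url, body))
    return requests


requests = rivers_for_collections("testmongo", ["person", "address"], "mongoindex")
for url, body in requests:
    # shlex.quote keeps the JSON intact when pasted into a shell.
    print(f"curl -XPUT {shlex.quote(url)} -d {shlex.quote(json.dumps(body))}")
```

Whether the plugin accepts several rivers writing into the same index is something to check against the mongodb river documentation; the sketch only shows how the URLs and bodies would differ.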

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
