Trying to ingest a MongoDB collection into ES with the Mongo _id

Hello,
I am trying to use Logstash to ingest a MongoDB collection into ES 6, but I'm facing a problem with Mongo's _id field.

I've found that many people solve this issue by simply removing that id and letting ES generate its own, but that solution does not fit my needs, as I'd like to keep the same _id in order to be able to update and delete documents later.
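For the record, the workaround I mean is something along these lines in the filter section (a sketch of what I'm trying to avoid):

filter {
  mutate {
    # discard the MongoDB id so ES assigns its own _id
    remove_field => ["_id"]
  }
}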

Here is my conf file:

input {
  mongodb {
    uri => 'mongodb://localhost:27017/test'
    placeholder_db_dir => '/tmp'
    collection => 'testdata'
    batch_size => 100
  }
}

output {
  elasticsearch {
    hosts => "10.x.x.x:9200"
    index => "testdata"
    doc_as_upsert => true
  }
}

I'm testing with just very basic data, like:

{ "_id" : ObjectId("5a264cace9b55012c6580124"), "myfield" : "myvalue1" }
{ "_id" : ObjectId("5a264d2be9b55012c6580125"), "myfield" : "myvalue2" }
{ "_id" : ObjectId("5a264d2fe9b55012c6580126"), "myfield" : "myvalue3" }

The new ES index is created, but no docs are actually inserted.
The error I find in the log is:

[2017-12-05T10:26:38,312][DEBUG][o.e.a.b.TransportShardBulkAction] [testdata][4] failed to execute bulk item (index) BulkShardRequest [[testdata][4]] containing [index {[testdata][doc][uSnIJWAB-GaCrrUis6xn], source[{"@timestamp":"2017-12-05T08:26:38.227Z","log_entry":"{"_id"=>BSON::ObjectId('5a264d2fe9b55012c6580126'), "myfield"=>"myvalue3"}","logdate":"2017-12-05T07:39:27+00:00","host":"localhost","@version":"1","_id":"5a264d2fe9b55012c6580126","mongo_id":"5a264d2fe9b55012c6580126","myfield":"myvalue3"}]}]
org.elasticsearch.index.mapper.MapperParsingException: Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters

I don't believe I'm the only one with this kind of need, but I could not find anything either on this site or by googling around, so I wonder if the solution is as simple as adding some parameter to the conf file, but I just don't see it :slight_smile:

I've sorted it out :slight_smile:

Basically, it's no longer possible to send _id to ES as part of the document itself; the solution was to rename Mongo's _id in a filter and then use the document_id option in the elasticsearch output section, as in my new conf file:

input {
  mongodb {
    uri => 'mongodb://localhost:27017/test'
    placeholder_db_dir => '/tmp'
    collection => 'testdata'
    batch_size => 100
  }
}

filter {
  mutate {
    rename => { "_id" => "mongo_id" }
  }
}

output {
  elasticsearch {
    hosts => "10.x.x.x:9200"
    index => "testdata"
    doc_as_upsert => true
    document_id => "%{mongo_id}"
  }
}

That was it!
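One refinement I considered but haven't tested thoroughly (just a sketch): if you'd rather not keep the duplicated mongo_id field inside the document source, you can stash the id under @metadata instead, since @metadata fields can be referenced with sprintf in the output but are not sent to ES:

filter {
  mutate {
    # keep the MongoDB id available to the output without storing it in _source
    rename => { "_id" => "[@metadata][mongo_id]" }
  }
}

output {
  elasticsearch {
    hosts => "10.x.x.x:9200"
    index => "testdata"
    doc_as_upsert => true
    document_id => "%{[@metadata][mongo_id]}"
  }
}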
