Logstash with MongoDB input adding a default "document" field to the Elasticsearch output index

Hi there,

I'm new to the ELK stack. I've done quite a bit of research but wasn't able to find an answer to my current problem, so I'm hoping someone can help me.

I've been trying to ingest documents from MongoDB using Logstash and output them to Elasticsearch.

The document format that I'm reading from Mongo and then indexing is:

{
	"_id" : "d65c67c9-d229-49de-be85-32d4f3604f0a",
	"createdDate" : ISODate("2020-11-15T17:19:17.581Z"),
	"updatedDate" : ISODate("2020-11-15T17:20:15.582Z"),
	"roomId" : 1,
	"propertyId" : 1,
	"defaultPrice" : "1000",
	"prices" : [ 
		{
			"date" : ISODate("2020-11-16T00:00:00.000Z"),
			"value" : "200"
		}
	],
	"bookedSlots" : [ 
		ISODate("2020-11-16T00:00:00.000Z")
	]
}

And this is the Logstash pipeline that I'm using to send the data to Elasticsearch:

input {
   jdbc {
	 jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/mongojdbc2.3.jar"
	 jdbc_driver_class => "com.dbschema.MongoJdbcDriver"
	 jdbc_connection_string => "jdbc:mongodb://mongo/availability"
	 jdbc_user => ""
	 
	 statement => "
		var lastValue = :sql_last_value;
		var extractedDate = lastValue.substring(0,10);
		var extractedTime = lastValue.substring(11,19);
		var concatDateTime = extractedDate + 'T' + extractedTime + 'Z';
		db.rooms.find({ updatedDate: { $gt : new ISODate(concatDateTime)} })"
	 schedule => "/5 * * * * *"
	 last_run_metadata_path => "/usr/share/logstash/mongoSqlLastValue.yaml"
  } 
}

output {
 
  stdout { codec => rubydebug  }

  elasticsearch { 
	index => "availability"
	hosts => ["elasticsearch:9200"]
	doc_as_upsert => true
	document_id => "%{[document][_id]}"
  }
}

This is the document after it has been indexed, as returned by an Elasticsearch search:

{
	"took": 1145,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 1.0,
		"hits": [
			{
				"_index": "availability",
				"_type": "_doc",
				"_id": "d65c67c9-d229-49de-be85-32d4f3604f0a",
				"_score": 1.0,
				"_source": {
					"@timestamp": "2020-11-20T12:08:55.456Z",
					"@version": "1",
		----------->"document": {<----------
						"roomId": 1,
						"defaultPrice": "1000",
						"_id": "d65c67c9-d229-49de-be85-32d4f3604f0a",
						"createdDate": "2020-11-15T17:19:17.581Z",
						"updatedDate": "2020-11-20T12:08:50.582Z",
						"propertyId": 1,
						"bookedSlots": [
							"2020-11-16T00:00:00.000Z"
						],
						"prices": [
							{
								"date": "2020-11-16T00:00:00.000Z",
								"value": "200"
							}
						]
					}
				}
			}
		]
	}
}

I don't understand why Logstash wraps the collection data from Mongo in a "document" field when sending it to Elasticsearch.

When I then use the Elasticsearch NEST client in .NET and try to find a document by roomId, for example:

client.SearchAsync<Room>(s =>
            s.Query(q =>
                q.Term(m => m
                    .Field(f => f.RoomId)
                    .Value(roomId))));

I don't get any results because RoomId is not at the root of the document.

Is there a way to get rid of this "document" wrapper and have all fields at the root?
I have another pipeline that uses SQL as input, and the fields in that pipeline are added without the "document" wrapper.
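
The only workaround I've come up with so far is flattening the event myself in the pipeline. This is just a minimal sketch, assuming the wrapper is always literally named "document"; I haven't confirmed it's the right approach:

filter {
  ruby {
    # Copy every key under [document] to the event root, then drop the wrapper.
    code => "
      doc = event.get('document')
      if doc.is_a?(Hash)
        doc.each { |k, v| event.set(k, v) }
        event.remove('document')
      end
    "
  }
}

If I went that route, I suppose document_id in the elasticsearch output would also need to change to "%{_id}", since _id would then be at the root.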

Thanks for any help!
