Hi there,
I'm using Logstash to read documents from a specific MongoDB collection and save them to Elasticsearch.
The nested fields end up in "log_entry" as one JSON string, starting with "BSON" or "ID", depending on the manipulations I do in the filter.
Here is an example of "log_entry" (there is a lot of text in it, so I'm only showing the structure, not the full content):
"log_entry": "{\"_id\": \"122ghgh1111\", \"msg_body\": {\"text_one\": 2, \"text_data\": [{\"position\": 1}, {....}, {...}]}}"
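To be explicit about the goal, this is roughly what I'd like the indexed document to look like instead of one string (field names are taken from the example above; "mongo_id" is the "_id" renamed by my filter, and the id value is just a placeholder):
"mongo_id": "122ghgh1111",
"msg_body": {
  "text_one": 2,
  "text_data": [
    { "position": 1 },
    { ... },
    { ... }
  ]
}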
Below is my config (I've tried different approaches, so I'll share them all; none of them does what I'd like to achieve):
logstashPipeline:
logstash.conf:
input {
  mongodb {
    uri => 'mongodb://user:password@host:port/<db_name>?directConnection=true'
    placeholder_db_dir => '/opt/logstash-mongodb'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'my_collection'
    codec => "json"
  }
}
# First try - still saves the nested JSON as one string
filter {
  mutate {
    gsub => [ "log_entry", "=>", ": " ]
    rename => { "_id" => "mongo_id" }
    remove_field => ["_id"]
  }
  mutate {
    gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]
    rename => { "_id" => "mongo_id" }
  }
}
# Second try - still saves the nested JSON as one string
filter {
  mutate {
    gsub => [ "log_entry", "=>", ": " ]
    rename => { "_id" => "mongo_id" }
    remove_field => ["_id"]
  }
  mutate {
    gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]
    rename => { "_id" => "mongo_id" }
  }
  grok {
    match => { "log_entry" => "%{DATA:log_entry}" }
  }
  json {
    source => "log_entry"
    remove_field => ["log_entry"]
  }
}
output {
  elasticsearch {
    action => "index"
    index => "mongo_log_data"
    hosts => ["https://<host>:9200"]
    ssl => false
    ssl_certificate_verification => false
    user => "elastic"
    password => "some_password"
  }
}
Can you please help me build a correct, working filter that achieves what I need?
Thanks in advance.