Add filter to send only specific fields to Elasticsearch


(Karthik) #1

Hello Team,
Greetings

I want Filebeat to send only specific fields, or
Elasticsearch to receive only specific fields, or
Logstash to filter the incoming fields and store only the ones I want.
My ultimate goal is to see only the few fields I want when I look at the logs in Kibana.


#2

You could try a Logstash prune filter with the whitelist_names option.


(Karthik) #3

@Badger
I am quite new to using this.
Could you please explain it with an example?


(Karthik) #4

So far I have been using this filter, which is the default one I found on the website.
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}


#5

OK, I think I misinterpreted your question. What does your input look like, and what do you want to see in elasticsearch?


(Karthik) #6

Hey @Badger

Here it is. I am pasting the entire config.
input {
  beats {
    port => 5044
    ssl => false
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => localhost
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

And now my logs are coming in as:

{
  "_index": "filebeat-2019.02.12",
  "_type": "doc",
  "_id": "skuhlhdunA-Ssjuo",
  "_version": 1,
  "_score": null,
  "_source": {
    "beat": {
      "name": "ABC",
      "hostname": "ABC",
      "version": "6.X.X"
    },
    "host": {
      "name": "ABC",
      "id": "acjkgdljhvjhfgfjhlebkjhdg",
      "architecture": "",
      "containerized": true,
      "os": {
        "platform": "aaa",
        "codename": "Core",
        "version": "7 ",
        "family": "zaaaa"
      }
    },
    "message": "The log with the affected area",
    "offset": 6743534,
    "input": {
      "type": "log"
    },
    "prospector": {
      "type": "log"
    },
    "meta": {
      "cloud": {
        "machine_type": "qqqqqq",
        "region": "",
        "availability_zone": "sdfsfd",
        "instance_id": "fsdfsdfsfsfdf",
        "provider": "sfds"
      }
    },
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "@timestamp": "2019-02-12T17:20:46.163Z",
    "source": "error.log",
    "@version": "1"
  },
  "fields": {
    "@timestamp": [
      "2019-02-12T17:20:46.163Z"
    ]
  },
  "sort": [
    1549992046163
  ]
}

The requirement is that I don't want any of those fields except for a few, such as hostname, message, and source.


#7

OK, so maybe a prune filter is what you want. It is not installed by default; the documentation that I linked to before explains how to install it.

filter {
  prune {
    whitelist_names => ["^hostname$", "^message$", "^source" ]
  }
}
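
A note on the filter above, offered as a sketch rather than a confirmed fix: the prune filter only operates on top-level fields, and in the sample document hostname appears nested under beat, not at the top level. One possible approach is to copy the nested value to a top-level field before pruning. The target field name "hostname" and the decision to keep @timestamp are assumptions made for illustration; @timestamp is kept because the index name in the output section is formatted from the event date.

```
filter {
  # Copy the nested value to a top-level field so that prune can match it.
  # The target name "hostname" is chosen here for illustration.
  mutate {
    copy => { "[beat][hostname]" => "hostname" }
  }

  # Keep only the listed top-level fields; everything else is removed.
  # "@timestamp" is kept on the assumption that the date-based index
  # name and Kibana still need it.
  prune {
    whitelist_names => [ "^hostname$", "^message$", "^source$", "^@timestamp$" ]
  }
}
```

Anchoring each pattern with both ^ and $ limits the match to the exact field name; the patterns are regular expressions, so an unanchored "^source" would also keep any other field whose name starts with "source".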

(Karthik) #8

Hey @Badger

To confirm, this is the config I am now working with:
input {
  beats {
    port => 5044
    ssl => false
  }
}
filter {
  prune {
    whitelist_names => ["^hostname$", "^message$", "^source" ]
  }
}
output {
  elasticsearch {
    hosts => localhost
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

Please confirm whether it looks good and whether this is what you expected.


#9

That's what I am suggesting, yes.


(Karthik) #10

That didn't help, @Badger.
It is still giving me the same output.


#11

Did you ingest some new logs? Adding the filter will not change the documents already ingested.


(Karthik) #12

@Badger
Yes, I did, of course. :smile:
I am getting the entire stack of fields again, which I don't want.
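
The thread does not establish why the output stayed the same. One way to narrow it down, sketched here under stated assumptions, is to run the prune filter in isolation and inspect what Logstash prints. The generator input and the single hand-written event below are stand-ins for the Beats input, and the file name prune-test.conf is hypothetical.

```
# prune-test.conf (hypothetical file name)
input {
  generator {
    # One made-up event shaped roughly like the sample document in this thread
    lines => [ '{"beat":{"hostname":"ABC"},"message":"The log with the affected area","source":"error.log","offset":6743534}' ]
    count => 1
    codec => json
  }
}
filter {
  prune {
    whitelist_names => [ "^message$", "^source$" ]
  }
}
output {
  # Print the event as Logstash sees it after filtering
  stdout { codec => rubydebug }
}
```

This could be run with bin/logstash -f prune-test.conf. If fields such as offset still appear in the printed event, the filter is not being applied. If they are gone, a likely explanation is that the pipeline feeding Elasticsearch is not running the edited configuration, for example because Logstash was not restarted or because Filebeat is writing to Elasticsearch directly. Note also that _index, _type, _id, and similar entries shown in Kibana are Elasticsearch metadata, not event fields, so no Logstash filter will remove them.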


(system) closed #13

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.