Trouble with JSON filtering


(sparkie) #1

Hi,

I'm having some issues trying to get some JSON into ES.

I'm using filebeat to pass the JSON data over TCP to Logstash. Below is a sample of the JSON. The JSON format is correct; it's just not feeding the data in. I can see what it's doing in the debug output, but what I see there isn't what's going into ES.


{"classification.taxonomy": "abusive content", "raw": "MS4zMi4xMjguMC8xOCA7IFNCTDI4NjI3NQ==", "feed.accuracy": 100.0, "classification.type": "spam", "feed.provider": "Spamhaus", "feed.url": "https://www.spamhaus.org/", "feed.name": "Spamhaus Drop", "time.source": "2017-12-29T19:49:07+00:00", "time.observation": "2018-01-10T04:05:38+00:00", "extra": "{\"blocklist\": \"SBL286275\"}", "source.network": "1.32.128.0/18"}
{"classification.taxonomy": "abusive content", "raw": "NS44LjM3LjAvMjQgOyBTQkwyODQwNzg=", "feed.accuracy": 100.0, "classification.type": "spam", "feed.provider": "Spamhaus", "feed.url": "https://www.spamhaus.org/", "feed.name": "Spamhaus Drop", "time.source": "2017-12-29T19:49:07+00:00", "time.observation": "2018-01-10T04:05:38+00:00", "extra": "{\"blocklist\": \"SBL284078\"}", "source.network": "5.8.37.0/24"}


input {
  beats {
    port => 9515
    codec => json
    type => "data"
  }
}
filter {
  if [type] == "data" {
    kv {
    }
  }
}
output {
  if [type] == "data" {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => ["127.0.0.1:9200"]
      index => "idata-%{+YYYY.MM}"
    }
  }
}
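
Worth noting: with codec => json already on the beats input, each line arrives parsed, and the kv filter has nothing to do (kv with no source reads [message], which the codec has already consumed). As a sketch of the alternative, the parsing could happen in the filter stage instead, assuming the codec is removed from the input so the raw line lands in [message]:

input {
  beats {
    port => 9515
    type => "data"
  }
}
filter {
  if [type] == "data" {
    # parse the raw JSON line shipped by filebeat; a parse failure adds
    # a _jsonparsefailure tag instead of passing through silently
    json {
      source => "message"
    }
  }
}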


Debug output


{
    "feed.url" => "https://www.spamhaus.org/",
    "feed.provider" => "Spamhaus",
    "offset" => 7238348,
    "time.observation" => "2018-01-27T01:24:43+00:00",
    "input_type" => "log",
    "raw" => "MjIzLjE3My4wLjAvMTYgOyBTQkwyMDQ5NTQ=",
    "feed.name" => "Spamhaus Drop",
    "source" => "/opt/file-output/spamhous.txt",
    "source.network" => "223.173.0.0/16",
    "type" => "data",
    "tags" => [
        [0] "beats_input_codec_json_applied"
    ],
    "@timestamp" => 2018-01-31T04:40:40.344Z,
    "time.source" => "2018-01-25T14:32:35+00:00",
    "classification.type" => "spam",
    "extra" => "{\"blocklist\": \"SBL204954\"}",
    "@version" => "1",
    "beat" => {
        "name" => "blar",
        "hostname" => "blar",
        "version" => "5.4.1"
    },
    "host" => "blar",
    "classification.taxonomy" => "abusive content",
    "feed.accuracy" => 100.0
}
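
One detail in this output that may matter: filebeat adds a plain string field source ("/opt/file-output/spamhous.txt") while the data also carries source.network. If Elasticsearch interprets the dotted name as an object path, source would have to be both a string and an object, and documents could be rejected on a mapping conflict even though rubydebug looks fine; the Logstash and Elasticsearch logs would be the place to confirm. If that turns out to be the cause, one option is the logstash-filter-de_dot plugin; a sketch, assuming the plugin is installed:

filter {
  de_dot {
    # rewrite dotted field names with underscores, e.g.
    # source.network -> source_network, so they no longer
    # collide with filebeat's own "source" field
  }
}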


Kibana

name      type      format   searchable   aggregatable   excluded   controls
_id       string
_index    string
_score    number
_source   _source
_type     string

So none of the fields have arrived, which means there is no time field either, so I can only create an index pattern with no time dependence.
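
Even once documents do arrive, Kibana needs a date-mapped field for a time-based index pattern; a minimal sketch with the date filter, assuming the dotted name time.source resolves as a literal top-level field, would copy it into @timestamp:

filter {
  date {
    # parse the feed's own ISO8601 timestamp into @timestamp so Kibana
    # can offer a time-based index pattern on this index
    match => [ "time.source", "ISO8601" ]
  }
}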

In Cerebro the entire index is only 810b :frowning: yet the file it's importing is over 100 MB.

Any help would be appreciated.

Thanks.


{
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 0,
        "deleted": 0
      },
      "store": {
        "size": "810b",
        "size_in_bytes": 810,
        "throttle_time": "0s",
        "throttle_time_in_millis": 0
      }
    }
  }
}

(Naveen M) #2

Try removing the if condition in the output; you should see the input arriving in ES/Kibana, as below.
output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
  }
  stdout { codec => rubydebug }
}
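
If the conditional is needed so other ingests stay out of this index, a middle ground while debugging is to keep it on the elasticsearch output only and leave stdout unconditional; a sketch:

output {
  if [type] == "data" {
    elasticsearch {
      hosts => ["127.0.0.1:9200"]
      index => "idata-%{+YYYY.MM}"
    }
  }
  # unconditional while debugging: prints every event regardless of type
  stdout { codec => rubydebug }
}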


(sparkie) #3

I have the if condition there so that I see only the output from that ingest; I have a number of other things being ingested. The output above is a sample of the rubydebug. I know this is working, as when I delete the index it gets recreated. I can also feed other JSON samples in using that filter and those work fine, and I have validated the JSON structure, which is also fine. It just seems to be ignoring it :frowning:

But thanks for the suggestion, Naveen.


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.