JSON Parsing using Filebeat


(Sanjeev Ramakrishnan) #1

Hi Team,
I'm trying to parse a log file that contains JSON-like data in the format shown below:

<
{
agentId: "TMS",
apiVersion: "v2",
entities: [
{
agentId: "ServerName1",
name: "UPCWOCHKDGT",
cacheManagerName: "RT_Article_CacheMgr",
attributes: {
Size: 799625,
NonStopTimeoutRate: 0,
LocalOffHeapSizeInBytes: 0,
LocalDiskSizeInBytes: 0,
CacheSearchRate: 0,
CacheRemoveRate: 0,
CacheOffHeapMissRate: 0,
CacheOnDiskHitRate: 0,
WriterQueueLength: 0,
CacheOffHeapHitRate: 0,
CacheExpirationRate: 0,
LocalHeapSize: 0,
NonStopFailureRate: 0,
CacheOnDiskMissRate: 0,
CacheInMemoryMissRate: 0,
TransactionCommitRate: 0,
LocalHeapSizeInBytes: 0,
NonStopRejoinTimeoutRate: 0,
TransactionRollbackRate: 0,
CacheHitRate: 0,
CacheEvictionRate: 0,
NonStopSuccessRate: 0,
LocalOffHeapSize: 0,
CacheInMemoryHitRate: 0,
LocalDiskSize: 0,
CacheUpdateRate: 0
}
},
{
agentId: "ServerName2",
name: "XRefUPC14Digit",
cacheManagerName: "RT_Article_CacheMgr",
attributes: {
Size: 984362,
NonStopTimeoutRate: 0,
LocalOffHeapSizeInBytes: 0,
LocalDiskSizeInBytes: 0,
CacheSearchRate: 0,
CacheRemoveRate: 0,
CacheOffHeapMissRate: 0,
CacheOnDiskHitRate: 0,
WriterQueueLength: 0,
CacheOffHeapHitRate: 0,
CacheExpirationRate: 0,
LocalHeapSize: 0,
NonStopFailureRate: 0,
CacheOnDiskMissRate: 0,
CacheInMemoryMissRate: 0,
TransactionCommitRate: 0,
LocalHeapSizeInBytes: 0,
NonStopRejoinTimeoutRate: 0,
TransactionRollbackRate: 0,
CacheHitRate: 0,
CacheEvictionRate: 0,
NonStopSuccessRate: 0,
LocalOffHeapSize: 0,
CacheInMemoryHitRate: 0,
LocalDiskSize: 0,
CacheUpdateRate: 0
}
}
.
.
.
/>

I need each line to become a field name and value. For example, the JSON line Size: 799625 should be parsed as "FieldName" => "Size" and "Value" => "799625". I need to display the data under the 'attributes' section in a data table in Kibana, based on the 'agentId' field value.
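As a rough illustration of the target shape described above (this is not a Filebeat or Logstash feature), the following Python sketch turns each entry under 'attributes' into one row tagged with the entity's 'agentId'. It assumes the document has already been decoded from valid JSON with quoted keys; the function name attribute_rows and the shortened sample are hypothetical.

```python
import json


def attribute_rows(document):
    """Yield one row per attribute, tagged with the owning entity's agentId."""
    for entity in document.get("entities", []):
        for field_name, value in entity.get("attributes", {}).items():
            yield {
                "agentId": entity.get("agentId"),
                "FieldName": field_name,
                "Value": str(value),
            }


# Shortened sample with quoted keys (an assumption: the post shows unquoted keys,
# which a strict JSON parser would reject).
sample = json.loads("""
{
  "agentId": "TMS",
  "apiVersion": "v2",
  "entities": [
    {"agentId": "ServerName1", "name": "UPCWOCHKDGT",
     "attributes": {"Size": 799625, "CacheHitRate": 0}},
    {"agentId": "ServerName2", "name": "XRefUPC14Digit",
     "attributes": {"Size": 984362, "CacheHitRate": 0}}
  ]
}
""")

rows = list(attribute_rows(sample))
for row in rows:
    print(row)
```

Each row carries its own 'agentId', so a table can be grouped or filtered on that field without needing the nested structure.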

I tried using the 'multiline' configuration together with 'decode_json_fields', as shown below, with no JSON configuration on the Logstash side.

<
multiline.pattern: ^{
multiline.negate: true
multiline.match: after

  - decode_json_fields:
    fields: ["message"]
    target: json
    />

When I try this, the entire JSON message ends up in a single message field, i.e.:

<
{
agentId: "PerfWAG1$dlap-w1intg0319.walgreens.com_37360",
name: "WICUPC",
cacheManagerName: "RT_Article_CacheMgr",
attributes: {
Size: 1066028,
NonStopTimeoutRate: 0,
LocalOffHeapSizeInBytes: 0,
LocalDiskSizeInBytes: 0,
CacheSearchRate: 0,
CacheRemoveRate: 0,
CacheOffHeapMissRate: 0,
CacheOnDiskHitRate: 0,
WriterQueueLength: 0,
CacheOffHeapHitRate: 0,
CacheExpirationRate: 0,
LocalHeapSize: 100,
NonStopFailureRate: 0,
CacheOnDiskMissRate: 0,
CacheInMemoryMissRate: 0,
TransactionCommitRate: 0,
LocalHeapSizeInBytes: 0,
NonStopRejoinTimeoutRate: 0,
TransactionRollbackRate: 0,
CacheHitRate: 0,
CacheEvictionRate: 0,
NonStopSuccessRate: 0,
LocalOffHeapSize: 0,
CacheInMemoryHitRate: 0,
LocalDiskSize: 0,
CacheUpdateRate: 0
}
},
>
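One possible explanation, offered as an assumption rather than something confirmed in this thread: with the pattern ^{, negate: true and match: after, every line that starts with { begins a new event, so the file is cut into fragments such as the one above, ending in "}," and none of them is a complete JSON document on its own. The Python sketch below only mimics that grouping rule to show the effect; group_multiline is a hypothetical helper, not Filebeat code, and the sample uses quoted keys and a shortened attribute list.

```python
import json
import re


def group_multiline(lines, pattern=r"^\{"):
    """Mimic multiline with negate: true and match: after.

    A line that matches the pattern starts a new event; every other
    line is appended to the event that precedes it.
    """
    start = re.compile(pattern)
    events, current = [], []
    for line in lines:
        if start.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


# Shortened stand-in for the log file (quoted keys are an assumption).
lines = [
    '{',
    '"agentId": "TMS",',
    '"entities": [',
    '{',
    '"agentId": "ServerName1",',
    '"attributes": {',
    '"Size": 799625',
    '}',
    '},',
    '{',
    '"agentId": "ServerName2",',
    '"attributes": {',
    '"Size": 984362',
    '}',
    '}',
    ']',
    '}',
]

events = group_multiline(lines)
for number, event in enumerate(events, start=1):
    try:
        json.loads(event)
        print(f"event {number}: decodes as JSON")
    except json.JSONDecodeError as err:
        print(f"event {number}: not standalone JSON ({err.msg})")
```

If the file on disk really has unquoted keys as shown in the post, a strict JSON decoder would reject the events for that reason as well.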

< logstash configuration

input {
beats {
port => 5044
}
}

output{
elasticsearch {
hosts => ["localhost:9200"]
index => "sample"
}
stdout {
codec => rubydebug
}
}

Am I missing anything? Should I configure Logstash as well to get my required output?
I'm new to parsing JSON data through ELK. It'd be great if someone could help me with this issue.

Thank you in advance.


(Anthony Lazam) #2

Hi Sanjeev,

Is there any reason why you're using multiline?

I suggest using Filebeat's prospector configuration for JSON (Docs), since you have a JSON object per line.
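For reference, "a JSON object per line" means each event is one complete object on a single line. The sample in the first post is pretty-printed over many lines, so it would likely need to be re-serialised before line-by-line decoding can work. A minimal Python sketch of such a pre-processing step follows, assuming the source can be loaded as valid JSON; the function name entities_to_ndjson and the field name sourceAgentId are hypothetical.

```python
import json


def entities_to_ndjson(document):
    """Return one compact JSON line per entity, keeping the top-level agentId."""
    lines = []
    for entity in document.get("entities", []):
        record = dict(entity)
        # Hypothetical field name for the top-level agentId ("TMS" in the sample).
        record["sourceAgentId"] = document.get("agentId")
        lines.append(json.dumps(record, separators=(",", ":")))
    return lines


# Shortened sample with quoted keys (an assumption about the real file).
document = {
    "agentId": "TMS",
    "apiVersion": "v2",
    "entities": [
        {"agentId": "ServerName1", "name": "UPCWOCHKDGT",
         "attributes": {"Size": 799625, "CacheHitRate": 0}},
        {"agentId": "ServerName2", "name": "XRefUPC14Digit",
         "attributes": {"Size": 984362, "CacheHitRate": 0}},
    ],
}

ndjson = entities_to_ndjson(document)
print("\n".join(ndjson))
```

Each printed line is a self-contained object, which is the shape that per-line JSON decoding expects.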

Also, if you're only parsing data from Filebeat and sending it only to Elasticsearch, then I recommend looking at the Pipeline feature of Ingest node (Elasticsearch).


(Sanjeev Ramakrishnan) #3

Hi Anthony Lazam,

Thank you for taking the time to reply, and sorry for the late response.

I was using multiline so that I could process the data on the Logstash end.

I did use the JSON configuration as described in the documentation, but I got each line as a separate message in the output, which is not what we're expecting, so I used the multiline configuration instead.

It'd be great if you could suggest a better way of parsing this nested JSON.