This is a followup to this question: Kvpairs where one of the values is a JSON structure
I am using the following filter to interpret my logs:
filter {
  grok {
    match => { "message" => "msg=%{WORD:action} %{GREEDYDATA:kvpairs}" }
  }
  kv {
    source => "kvpairs"
    remove_field => ["kvpairs"]
  }
  json {
    source => "jsondata"
    remove_field => ["jsondata"]
  }
}
In my input message I have JSON data that looks like this:
jsondata={"parentId":"80b9b4e0-5552-45ca-a202-245f89e98635","id":"4d1dac94-dabb-4cf1-b8f0-b6c9cd348783","childIds":[],"priority":"LOWEST","inputData":{"dataId":"8083afdf-69c3-40f4-b528-daf39a7c7310","encodingName":"UTF-8","language":{"code":"por","name":"Portuguese"},"script":{"code":"latn","name":"Latin"}},"targetLanguage":{"language":{"code":"eng","name":"English"},"script":{"code":"latn","name":"Latin"}},"targetEngine":{"name":"Cybertrans","versionNumber":{"majorNumber":13,"minorNumber":11,"patchNumber":6}},"outputData":{"dataId":"b3e0cfe6-64e6-48fa-a84a-a8d56d098e69","encodingName":"UTF-8","language":{"code":"eng","name":"English"},"script":{"code":"latn","name":"Latin"}},"status":"COMPLETE","messages":["Error
code from MT engine: NFW\u003d0.41735537190082644","System Selected
Language Pair-NONE\u003dPortuguese\u003eEnglish","System Selected MT
System\u003dMotrans","System Selected Dictionary\u003dgeneral","System
Selected User Dictionary\u003dn/a","System Selected Source
Encoding\u003dutf8","System Selected Text
Corrector\u003dundetermined","The language identified was Spanish utf8,
but it will be processed as Portuguese utf8.","Translation retrieved
from cache."]}
When I look at the processed logs in Kibana, I see that jsondata is truncated after the word Error:
{"parentId":"80b9b4e0-5552-45ca-a202-245f89e98635","id":"4d1dac94-dabb-4cf1-b8f0-b6c9cd348783","childIds":[],"priority":"LOWEST","inputData":{"dataId":"8083afdf-69c3-40f4-b528-daf39a7c7310","encodingName":"UTF-8","language":{"code":"por","name":"Portuguese"},"script":{"code":"latn","name":"Latin"}},"targetLanguage":{"language":{"code":"eng","name":"English"},"script":{"code":"latn","name":"Latin"}},"targetEngine":{"name":"Cybertrans","versionNumber":{"majorNumber":13,"minorNumber":11,"patchNumber":6}},"outputData":{"dataId":"b3e0cfe6-64e6-48fa-a84a-a8d56d098e69","encodingName":"UTF-8","language":{"code":"eng","name":"English"},"script":{"code":"latn","name":"Latin"}},"status":"COMPLETE","messages":["Error
This happens even if the length of the string before the word "Error" is changed, so the truncation does not depend on the length of the string. As noted below, the first whitespace in the string occurs right after the word Error, so that whitespace is more likely to be the cause of the problem than the specific word.
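To check the whitespace hypothesis in isolation, a minimal test pipeline along these lines could be used. This is only a sketch: the stdin input and the rubydebug output are there just for the test and are not part of my real setup, and it relies on the kv filter's default of splitting fields on spaces.

```
# Test-only pipeline: read one line from stdin, run kv on it, print the event.
input {
  stdin { }
}
filter {
  kv {
    # Assumption: defaults are left in place, so fields are split on
    # spaces and keys are separated from values by "=".
    source => "message"
  }
}
output {
  stdout { codec => rubydebug }
}
```

Feeding it a shortened line such as jsondata={"messages":["Error code"]} should show whether the jsondata value is cut off at the first space, regardless of which word precedes it.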
Any idea why this is happening, and how I might be able to prevent it? Thanks in advance!