Filebeat 6.5.4 is truncating some records - multibyte character issue?

On each host, a few Docker 18.09.6 containers write low-volume logfiles (a few entries a minute each) to a shared volume. Each test run takes under an hour and starts a new logfile. We delete files older than 10 hours (not using Filebeat to do that, for reasons which elude me at the moment), but we believe records are uploaded well before that limit.

In Kibana, the vast majority of the records are present, but I see a number of records whose error.message complains about JSON parsing errors. When I look at the message field of those records, the beginning of the line is missing. How much is missing varies from line to line: the records are of wildly variable length, and the truncation doesn't follow an obvious pattern.

When I go to look at the log files themselves, I can grep for the message text, and find the well-formed record in the cited logfile.

I also find files with .0 or .1 and occasionally .2 added to the name, when we only create *.log files, but all the records in those files look well-formed based on a quick scan with cut, rev, and uniq.

I see rev complain about an invalid multibyte character for the particular example file I'm examining on that machine, but it's fine if I scp the file to my local machine. In any case, the cut-off point isn't obviously related to where Unicode characters appear:

{"platform":"ios","application":"redact","redactrver":"QA","redactst":"redactdevel5.reda","tc_build_id":10097799,"tc_build_type_id":"IOS_FunctionalTests_redact_FunctionalTests_iOS13","tc_build_branch":"ios_test_run_161020","name":"I select \"Liked me\" filter","location":"redactfeatures/functional/connections/liked_you_screen.feature:77","regexp":"/^(?:I|primary_user) selects? \"(All connections|Liked me|Favourites|Chats|Visited me|Online|Избранные)\" filter$/","is_successful":true,"duration":4.82,"type":"step","agent_name":"FastAndFuriousD","build_is_personal":"false","computed_filters":"","triggered_by_build_id":"10097840","tc_build_type_url":"http://redactedan/viewType.html?buildTypeId=IOS_FuncitonalTests_redact_FunctionalTests_iOS13\u0026tab=buildTypeHistoryList\u0026branch=ios_test_run_161020","calabash_branch":"master"}
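A stricter check than the quick scan with cut, rev, and uniq would be to decode every line as UTF-8 and parse it as JSON. The following is a minimal sketch of that idea, not something I have run against the real files. The function name `find_bad_lines` is made up for illustration, and it assumes one JSON object per line. It works on raw bytes, so the result should not depend on the shell locale, which may be one reason rev behaves differently on the two machines.

```python
import json
import sys


def find_bad_lines(data: bytes):
    """Return (line_number, reason) for each line that is not valid UTF-8 JSON."""
    problems = []
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        if not raw.strip():
            continue  # ignore blank lines, including the one after the final newline
        try:
            text = raw.decode("utf-8")  # strict decoding: any bad byte raises
        except UnicodeDecodeError as exc:
            problems.append((lineno, f"invalid UTF-8 at byte {exc.start}"))
            continue
        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON at column {exc.colno}"))
    return problems


if __name__ == "__main__":
    # Usage: python check_lines.py /app_logs/*.log
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for lineno, reason in find_bad_lines(fh.read()):
                print(f"{path}:{lineno}: {reason}")
```

If this reports nothing for a file whose records show up truncated in Kibana, that would support the idea that the data on disk is fine and the damage happens while the file is being read.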

If I were trying to characterise the bug without reference to the multibyte issue, I would say it looks like Filebeat is reading to the end of the file just between two full-buffer write operations, and sending partial lines rather than scanning the data backwards for the most recent newline and sending only complete lines. I'm sure it's related to multibyte stuff, but that's what it looks like.
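To make the behaviour I expected concrete, here is a minimal sketch of a reader that only emits complete lines and holds back any trailing partial line until the next read. This is purely illustrative: `CompleteLineBuffer` and `feed` are hypothetical names, and I am not claiming this is how Filebeat is implemented.

```python
class CompleteLineBuffer:
    """Accumulate raw chunks and release only newline-terminated lines."""

    def __init__(self):
        self._pending = b""  # bytes read after the most recent newline

    def feed(self, chunk):
        """Add a chunk of bytes; return the list of complete lines now available."""
        data = self._pending + chunk
        last_newline = data.rfind(b"\n")  # scan backwards for the latest newline
        if last_newline == -1:
            # No complete line yet: keep everything for the next read.
            self._pending = data
            return []
        complete = data[:last_newline]
        self._pending = data[last_newline + 1:]  # hold back the partial tail
        return complete.split(b"\n")


if __name__ == "__main__":
    buf = CompleteLineBuffer()
    # Simulate a read that lands between two writes, mid-record.
    print(buf.feed(b'{"a":1}\n{"b"'))
    print(buf.feed(b':2}\n'))
```

Because the buffer works on bytes and splits only on the newline byte, a multibyte UTF-8 character that straddles two reads stays intact, as long as decoding happens after the line is complete.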

The config is this:

filebeat.prospectors:
- type: log
  enabled: true
  json.add_error_key: true
  json.keys_under_root: true
  json.overwrite_keys: true
  paths:
  - /app_logs/*.log

output:
  logstash:
    hosts:
      - "logstash.yyyy:xxxx"
    index: "mobile_qa_functional_tests_beat-%{+YYYY.MM.dd}"

setup.template.name: "mobile_qa_functional_tests_beat"
setup.template.pattern: "mobile_qa_functional_tests_beat-*"
setup.template.append_fields:
  - name: json.stack_trace
    type: text
