max_bytes works fine on input messages, but after the multiline filter is applied, every newline byte in the original message gets replaced by two printable bytes, \ and n, so the message we send on is slightly larger than expected. Is it possible to apply max_bytes after the multiline filter, so the result message has exactly that size?
The reader in Filebeat applies the max_bytes setting after multiline. The multiline reader normalizes multiline events by joining the individual lines with a single newline character, \n.
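For reference, here is a sketch of the kind of input configuration being discussed; the paths and the multiline pattern are illustrative placeholders, not taken from the thread:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log        # example path, adjust to your setup
    # max_bytes limits the event *after* multiline has joined the lines
    max_bytes: 6144
    # example pattern: lines not starting with '[' are continuations
    multiline.pattern: '^\['
    multiline.negate: true
    multiline.match: after
```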
Ok, but that doesn't change anything. The point is that the message arriving in Logstash/ES from Filebeat should be no larger than exactly the size set in max_bytes. As it is, I can't rely on that size and have to reduce it a little.
Hm... maybe I didn't fully understand your issue. max_bytes is already applied after multiline processing. The message content is not bigger than max_bytes when it is passed to the output. The only reasons the output can become bigger are characters that require a special (multi-byte) encoding, plus the additional metadata Beats sends in a JSON document along with the actual content. The full event will be bigger than the original log line.
It is exactly the \n characters: 2 bytes as printable characters instead of 1 byte in the source message. But if I set max_bytes: 6144, I want at most a 6144-byte message in Logstash. Not 6145, not 6146, etc. (it depends on the number of newlines). I send a 10K-byte message to Filebeat and get 6156 bytes in Logstash. If I replace every \n with a single \, I get exactly 6144 bytes.
It is the JSON encoding escaping special characters. This happens at the network level only. Logstash decodes the message, resolving these escape sequences -> after decoding, \n takes only 1 byte again.
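The byte-count difference can be reproduced with any JSON library; this small Python sketch (not part of Filebeat itself) shows the newline growing to two bytes on the wire and shrinking back to one after decoding:

```python
import json

# A multiline event joined with a single real newline, as Filebeat produces it
msg = "line one\nline two"
print(len(msg.encode("utf-8")))      # 17 bytes before JSON encoding

# JSON escapes the newline as the two printable characters '\' and 'n'
encoded = json.dumps(msg)
print(len(encoded))                  # 20 bytes: 17 + 1 extra for \n + 2 quotes

# Logstash-style decoding restores the real newline, so the size matches again
decoded = json.loads(encoded)
print(len(decoded.encode("utf-8")))  # back to 17 bytes
```

So max_bytes is honored on the decoded message; only the escaped on-the-wire representation is larger.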