I have a grok filter parsing logs from a DB2 server running on AIX. The logs are sent to Logstash as .txt files; each file contains many log messages, separated by a timestamp.
For a few log messages I receive a timeout message like "Value too large to output (XXXX bytes)...". I don't think anything is wrong with my grok pattern, because I put the same configuration into the Grok Debugger and it works fine.
So my question is: does grok have a size limit for the "message" field?
I suspect grok itself, because the Grok Debugger does not return any errors, and the "message" field that shows up in the terminal when the timeout message appears shows that grok did not capture the entire log message.
It's as if the message was cut in half.
Please, help me.
Edit 1: Most of the messages are processed correctly; only some of them produce the timeout message.
grok does not directly limit the size of the message field. I wouldn't assume that a grok debugger matches in the same way that grok does.
The "Value too large to output" is just the filter not wanting to log a very long log message, so it only prints the first 256 bytes of the field. It is still matching against the entire field.
Without seeing the pattern you are matching it is hard to suggest much. Read this blog on grok performance. Anchor patterns whenever possible, and avoid multiple GREEDYDATA patterns in the same expression. See if DATA and GREEDYDATA can be replaced by more specific patterns (e.g. NOTSPACE, or a custom pattern that consumes text up to a delimiter).
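Just to illustrate, here is a minimal sketch of what I mean. Since you haven't posted your pattern, the line layout and field names here are made up:

```
filter {
  grok {
    # Hypothetical line layout: "<timestamp> <level> <source> : <text>".
    # The ^ anchor makes non-matching lines fail fast instead of
    # backtracking through the whole event.
    # NOTSPACE and WORD are far cheaper than DATA/GREEDYDATA, and a
    # single GREEDYDATA at the very end is cheap because nothing
    # follows it that would force backtracking.
    match => { "message" => "^%{NOTSPACE:timestamp} %{WORD:level} %{NOTSPACE:source} : %{GREEDYDATA:text}" }
  }
}
```

A custom pattern that consumes text up to a delimiter, e.g. (?<source>[^:]+), works the same way and is another cheap replacement for DATA.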
I was going to point to UNIXPATH as an example of a pattern to avoid, since it can be monstrously expensive. However, it turns out this was fixed last year. The change basically just removed the alternation!