Log file decode error

I'm using the gzip_lines codec in Logstash, following this post: Read a gzip file with gzip_lines codec - #12 by TC_Huang

But I end up with the following debug output:

13:26:37.747 [[main]<file] DEBUG logstash.inputs.file - Received line {:path=>"/data1/logs/logs/20170514/access.log.20170514_501.log.gz", :text=>"\x1F\x8B\b\x00a\xB9\x17Y\x00\x03\xEC\xBD\xD9r\x1CW\x92-\xFA|\xEEW\xC0\xF4Pv\x8E\x9Dfj\xFB\xB0'\x94\xC9\xCA\xA8\xA1%u\x89\x14%\xB1J\xAA~\x81\xED\xB1\x04k\x8AdsPu\xF5\xB5\xFB\xEF\xD7=\x12\td$\xC1\x8CH\xE4@\x8Aj\x8Dd\x12@DF\xBA\xAF\xBD|Z\x8E\x06\xFC=c\xEF\x01?6\xF1\xDC\xBAs\x13\xFF\xAF\t\xE7\xC6\xFC\xAF\xF2\xEB\xE2\xF9\x8Bg\xF5uy\xB5H\xA5\xB4\x97/\x17\xBF<\xCB\x97O\xDA\xFF\xFA\x7F?z\x91J\xFB\xE8\xFC\xA3\xF4$\xBF\xFE\xE5\xA3\x7F\xF9\xE8\xD7\xCB\xDA\x9E]\\\xD6\x8F\xCE\x03a\x8C\xC1\xB9\x7F\xF9\xE8\xF2\xB9|\x01\x04Z \x9A\x05\xFB\x85\xB7\xF2\x85\xA5>\x95W\xE5\xBF\xF7\xFE\xD1\xF2\xBD\xFF|\xBA(\xCF\x9E<{\xF1\xEB\xA2<\x95?L\xE5\x95\xFC\xE1\xEB\xE75\xBDj\xFA\xDBz\xF1\xEA\x9F\xCF\xF52\xF7\xEB\x17\xFFU~NO\xFF>\xBC\xFC\xF7\xF6T\xBF\xEE\xC1\xB3\xFF\xBE|\xF2$}l\x17\xE6\xEC\x7F\x7Fs\xF9\xF4\xF5\x7F\xFD\xF1\xEC\xFE\xD3\xFA\xE2\xD9e=\xB3\v\xF8\xE3\xD9W\x7F\xB9\xFF\xE3\x17_\x9F=\xBE\xFF\xE5\xBD\xC7\xDF\x18s\xF6\xE9\xEB\xCB'\xF5\xE3\xE5\xAB\xAB\x17\xFFx\xF6\x8F_\xFF\xCF\xD9\xFD\xE7\xCF\x9F\xB4\x1F[\xFE\xF3\xE5\xAB\x8F-\xF9\x05\xB9\xB3\xFF\xFD\xE7\xAF\x1E?\xF8\xE6_\xCE\x9E\\\xFEG;\xFB\xB2\x95\xFFx\xF6\x7F\xCE\xFE\xDA^\xBC\xBC|\xF6\xF4c\x96+~\xF6\xF3\x8Bg\xBF4\xF9\xEA\x85Y\xA0\x0Fv\xC1\xF1\xEC\xC1\xF0t\xCE\x1E|\xF7\xDD\xA7/\x9E\xFD\xE3e{\xF1\xB1[\xE0\xD9\xE3O\x7F\xF8\xD80\xC9S8\xFB!\xF5\xF4\xE2ru\x89\a\x97\xE5\xC5\xB3\a\xF2X\x9B\xBC1\xFDZyTa\x11\xE5N\x1F\xB6W\x8F\xE5\x9D\x7F\xFC\xE3\xD7\xFF\xFA\xF5\xD97\xF2\xBE_\xCB\x9B\xFE\xF8\xBF\x7F\xBE\xF8\xEC\xA1>\x80\xD7\xAF~~\xF6\xE2\xE2\xBF\x9F=\x95gc\xAE\x7F\xFF\xA2\xFD\xFD\xF2\xE5\xAB\xF6\xA2\xD5\x8B\xA4\xCF\a\x87\xCF\x15\xEF\x81=\x038\x87x\x8E^\xBE\xF9\xF9\xB3\x97\xAF\x86\x0FJ\xBE\xF1\xF9\xEB\xFC\xE4\xF2\xE5\xCF\xE3oPC8\x13C`\xF9\x1E#\xDF\xF0\xA2u\xF9\x99/\xE4\xCF\xAF\x7F\xF3\xB4\xB4\xD5\x8Fx)\xB7\xAF\xBF\xFC\b\\H\xAE\x85B!Ev>\xA4j\x93\xB7\xD9r-\xA6\xE7\x14\xE4\x9B\xEB\xEB\x17\xE9\x95<\xBF\xF5\x9B\xBE|
y\xF1\xFC\xC5e{\xF9jx\xB1>\xFB\xC7\xD3'\xCFn>\xF7_\xFE\x99\x9E?\xBFy\xC7\xAF\xEBp\xA9R\xC0woK\x8B\x81-\xA1\xFC\xF9\xCBW\xE9\xD5\xEB\x97\x17lxy\x9BW_\xFF\xEB\xF2\xF3R;}Z\xEF\xC9\xA7\xB6 \xF9\xC3_\x9E=\xA9\xAB\xDB\x7F\xFD\xE2\x89\xFC\xE9\xCF\xAF^=?\xFF\xF8\xE3\xC1\x88o\f\xF2\xE3\xE7O\xD2??^\xD9\xF3\x9F^\x7F\xB2q\xDD?<\xFF\xE4\x97\x7F\xFE\xA1|\x82\x7F(\xE9\xD5'\xAF\x9F\xEB\x8D\xFF\xA1\x8BU|\xF2\xF7\x17\xCF^?\xFFE\x9E\x8C|j\xCBW^\xFE\x9C^\xB4\x9B\e{\xF6ryO\xFAy\xC8\xD7\\\xFC\xC7\xE5S}c\xBF\\>\xBD\x1C\x1E\xF2\x7F\xBEnW\x9F\xD2G%Gh\xA8\xCF\x12"}
13:26:37.750 [[main]<file] DEBUG logstash.codecs.gziplines - config LogStash::Codecs::GzipLines/@id = "446e269adcc3464cce1afbe70bc122fe28523aaa-1"
13:26:37.750 [[main]<file] DEBUG logstash.codecs.gziplines - config LogStash::Codecs::GzipLines/@enable_metric = true
13:26:37.750 [[main]<file] DEBUG logstash.codecs.gziplines - config LogStash::Codecs::GzipLines/@charset = "UTF-8"

I have tried several charsets, including utf-16bl, utf-16, and ascii, but none of them worked.

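You are reading the gz file directly with the file input: the debug output says Received line {:path=>"/data1/logs/logs/20170514/access.log.20170514_501.log.gz", so the compressed bytes themselves are being treated as line text. Changing the charset cannot fix that.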
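
The :text value in that debug line begins with "\x1F\x8B\b". Those are the two gzip magic bytes (0x1f 0x8b) followed by the deflate method byte (0x08), which shows the input is passing raw compressed data downstream. As a rough illustration only (this is plain Python, not part of Logstash, and the helper name `looks_like_gzip` is made up for this example), the same check can be sketched as:

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC
```

If this returns True for the path that the file input reports, the input is reading the archive itself rather than a list of archive paths.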

The gzip_lines codec works in an indirect way. The file input should read a file that contains a list of gz file paths, e.g. all_the_zipped_logs.txt.
If it is set up correctly, you should see something similar to Received line {:path=>"/data1/logs/logs/all_the_zipped_logs.txt", :text=>"/data1/logs/logs/20170514/access.log.20170514_501.log.gz"}
The codec should then receive the path to the zipped file, open it, and create an event from each line in the zipped file.
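
To make the indirection concrete, here is a minimal Python sketch of the idea, not the codec's actual implementation: a list file holds one gz path per line, and each path is opened and decompressed line by line. The function name `events_from_list_file` is hypothetical, and the sketch assumes the zipped logs are UTF-8 text.

```python
import gzip


def events_from_list_file(list_path):
    """Yield (gz_path, line) for each line of each gzip file named in list_path."""
    with open(list_path, "r", encoding="utf-8") as listing:
        for entry in listing:
            gz_path = entry.strip()
            if not gz_path:
                continue  # skip blank lines in the list file
            # Open the archive named by this entry and stream its lines.
            with gzip.open(gz_path, "rt", encoding="utf-8") as zipped:
                for line in zipped:
                    yield gz_path, line.rstrip("\n")
```

Here the list file plays the role of all_the_zipped_logs.txt: what gets "read" as a line is the path, and the decompression happens one step later.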

NOTE: there are a few gotchas in this solution that you should be aware of.

  1. If the zipped file is big, LS will not respond to a shutdown signal while it is reading the zipped file.
  2. You have used up your one-codec limit: all further protocol decoding or multiline handling must be done in the filter stages. However, the multiline filter is deprecated because it expects the events (each line in the zip file) to arrive in strict order, and we could not guarantee that.

That's really helpful, thank you!
