Logstash with Kafka compression and decompression

Hi, I'm new to Elastic and very interested in this pipeline:
Data Sources --> Logstash --> Kafka --> Logstash --> Elasticsearch, where the first Logstash specifies gzip compression in its Kafka output plugin and the second Logstash enriches the data with filter plugins.
I assume a gzip codec plugin is required on the second Logstash in order to process the data. Does that mean decompression happens on the second Logstash, or on the final Elasticsearch? Also, where does the compression actually happen: on the first Logstash or in Kafka?
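For reference, here is roughly what I have in mind for the first Logstash. The input, broker address, and topic name are just placeholders I made up:

    input {
      beats { port => 5044 }
    }
    output {
      kafka {
        bootstrap_servers => "kafka:9092"
        topic_id => "raw-logs"
        codec => json
        compression_type => "gzip"
      }
    }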

Thanks in advance!

Does anyone have experience with compressed data transfer between Logstash and Kafka? Thanks!

No, it is not required. I have Logstash reading from Kafka, discarding 99% of the data, and writing with gzip compression to another Kafka instance. Another Logstash instance reads that topic, and it does not specify compression on the input.
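Roughly like this; the topic and broker names are changed, and the filter condition is made up for illustration:

    input {
      kafka {
        bootstrap_servers => "source-kafka:9092"
        topics => ["firehose"]
      }
    }
    filter {
      # keep the ~1% of events we care about, drop everything else
      if [level] != "ERROR" { drop {} }
    }
    output {
      kafka {
        bootstrap_servers => "dest-kafka:9092"
        topic_id => "errors"
        compression_type => "gzip"
      }
    }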

Thanks Badger, it looks like you also have Logstash -> Kafka -> Logstash.
I want to do some data processing on the second Logstash, and in that case I assume a gzip codec is required. Do you have the same kind of data processing running on your second Logstash?

Do you happen to know where the compression happens: on the first Logstash or in Kafka?

You do not require a gzip codec on the second Logstash instance. The Kafka message header indicates whether the message is compressed, so the input plugin knows whether to decompress.
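So the consuming pipeline needs nothing compression-related. A minimal sketch, with example broker, topic, and host names:

    input {
      kafka {
        bootstrap_servers => "kafka:9092"
        topics => ["raw-logs"]
        codec => json   # matches the producer's serialization, not the compression
      }
    }
    filter {
      # your enrichment goes here
    }
    output {
      elasticsearch { hosts => ["http://localhost:9200"] }
    }

The codec only has to match how the events were serialized (json here); decompression is handled automatically by the underlying Kafka consumer client.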

I believe the Kafka producer (i.e. Logstash) is expected to do the compression, but I am not certain.
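If I remember right, the compression_type option is just passed through to the Kafka producer client embedded in Logstash, so the batch is compressed before it ever leaves for the broker. Something like this on the writing side (broker and topic names are placeholders):

    output {
      kafka {
        bootstrap_servers => "kafka:9092"
        topic_id => "raw-logs"
        # the producer client inside Logstash compresses the batch here,
        # equivalent to the Kafka producer property compression.type=gzip
        compression_type => "gzip"
      }
    }

The broker-side topic setting compression.type defaults to "producer", meaning the broker keeps whatever compression the producer applied, so the work is done by Logstash rather than by Kafka itself.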

Cool, this is interesting. So by the time the data reaches Elasticsearch it has already been decompressed, then.
