Logstash is sending garbage values to ES

I'm using Logstash to read data from an Oracle DB & ingest it into a Kafka topic, then read the data from that topic & ingest it into ES (basically Oracle -> Kafka, Kafka -> ES). The query captures data for a specific timestamp & fetches 52 rows. Everything works as expected till here, but after inserting all the rows, Logstash starts inserting the garbage value below:

2020-08-27T13:40:49.122Z %{host} 2020-08-27T13:40:48.799Z %{host} 2020-08-27T13:40:48.612Z %{host} 2020-08-27T13:40:48.323Z %{host} 2020-08-27T13:40:48.123Z %{host} 2020-08-27T13:40:47.723Z %{host} 2020-08-27T13:40:47.220Z %{host} %{message}

Configs that I'm using:

Oracle -> Kafka:

    input {
      jdbc {
        jdbc_driver_library => "/usr/share/logstash/ojdbc8.jar"
        jdbc_driver_class => "Java::oracle.jdbc.OracleDriver"
        jdbc_connection_string => "jdbc:oracle:thin:@redacted"
        jdbc_user => "redacted"
        jdbc_password => "redacted"
        tracking_column => "timestamp"
        use_column_value => true
        tracking_column_type => "timestamp"
        jdbc_default_timezone => "Asia/Kolkata"
        schedule => "*/10 * * * *"
        statement_filepath => "redacted"
      }
    }

    output {
      kafka {
       topic_id => "audit_data"
       bootstrap_servers => "redacted"
       acks => "0"
       jaas_path => "/usr/share/logstash/jaas.conf"
       sasl_kerberos_service_name => "kafka"
       kerberos_config => "redacted"
       codec => plain
       security_protocol => "SASL_PLAINTEXT"
      }
    }
Kafka -> ES:

    input {
        kafka{
    #    group_id => "logstash"
        jaas_path => "/usr/share/logstash/jaas.conf"
        sasl_kerberos_service_name => "kafka"
        kerberos_config => "redacted"
        auto_offset_reset => "latest"
        topics => ["audit_data"]
        codec => plain
    bootstrap_servers => "redacted"
        security_protocol => "SASL_PLAINTEXT"
    #    type => "syslog"
        decorate_events => true
        }
    }


    output {
        #stdout { codec =>  "json"}
        elasticsearch {
            hosts => ["redacted"]
            user => "redacted"
            password => "redacted"
            cacert => ["redacted"]
            action => "index"
            index => "kafka_logstash"
        }
    }

I checked the Kafka topic data from the consumer console & could see the garbage values continuously flowing in, so I removed the old data from the topic (set the retention to 1000ms) and used the same query & config parameters, this time directly from Oracle to ES. It worked fine without any garbage value. Below is the config I used:

    input {
      jdbc {
        jdbc_driver_library => "/usr/share/logstash/ojdbc8.jar"
        jdbc_driver_class => "Java::oracle.jdbc.OracleDriver"
        jdbc_connection_string => "jdbc:oracle:thin:@redacted"
        jdbc_user => "redacted"
        jdbc_password => "redacted"
        tracking_column => "timestamp"
        use_column_value => true
        tracking_column_type => "timestamp"
        jdbc_default_timezone => "Asia/Kolkata"
        schedule => "*/10 * * * *"
        statement_filepath => "/usr/share/logstash/oracle.sql"
      }
    }

    output {
        #stdout { codec =>  "json"}
        elasticsearch {
            hosts => ["redacted"]
            user => "redacted"
            password => "redacted"
            cacert => ["redacted"]
            action => "index"
            index => "test_kafka_logstash"
        }
    }

Please suggest how we can fix this.
Thanks!

@stephenb @Badger Could you please suggest something that I might've missed?

Hi @Himanshii

1st, it is not really best practice / polite to call on specific people to help with questions. @Badger and I (even though I am an Elastic team member) are volunteers on this forum and participate in our free time.

2nd, I would check to see if the Oracle -> Kafka Logstash pipeline is actually writing the bad lines. You can put another output in the output section; that way you will see whether Logstash is actually writing those lines or whether it is something in Kafka:

    stdout { codec => rubydebug }
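
For example, in the Oracle -> Kafka pipeline the output section could look something like this (just a sketch based on your config above; the kafka settings stay exactly as you have them, and the stdout output is only there temporarily for debugging):

    output {
      kafka {
        topic_id => "audit_data"
        bootstrap_servers => "redacted"
        acks => "0"
        jaas_path => "/usr/share/logstash/jaas.conf"
        sasl_kerberos_service_name => "kafka"
        kerberos_config => "redacted"
        codec => plain
        security_protocol => "SASL_PLAINTEXT"
      }
      # temporary: dump every event to stdout so you can compare it with what lands in Kafka
      stdout { codec => rubydebug }
    }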

@stephenb Sorry, I wasn't sure how this works; I thought my question got skipped. Noted for future reference :slight_smile: Thanks for your help! Appreciate it!
I tried the config below in the output section along with Kafka:

    file {
      path => "/tmp/logstash-kafka.txt"
      codec => rubydebug
    }

I had the consumer console open alongside this file & could see garbage data flowing into Kafka, but correct data in logstash-kafka.txt. The same happened when I tried Kafka along with ES; the data was ingested correctly into ES via Logstash whenever Kafka wasn't in between.

Hi @Himanshii no worries

I am not sure what is happening... it looks like something on the Kafka ingest side (I am not a Kafka expert).

Also, I would write the file with the same codec => plain to make sure all things are the same.
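
Something like this (the same file output you used, just swapping the codec so it matches the kafka output):

    file {
      path => "/tmp/logstash-kafka.txt"
      codec => plain
    }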

Also, just to set expectations... not all questions in the forum actually get answered, there are too many... the community does its best, but there is no guarantee.

If you need commercial support, I would suggest considering a commercial license, which comes with support.

@stephenb I checked the file; the data seemed consistent, if that's what you're suggesting here.

Another thing I observed in the Kafka console was that the garbage value below appeared in 52 lines (it occurred 52 times, which is the number of rows fetched from Oracle), & then it stopped sending any more data/garbage:

2020-08-29T20:57:32.574Z %{host} %{message}

Thanks for the suggestion. We're considering a commercial license as our requirements & cluster are growing, but that might take time...
