Logstash Azure Log Analytics output - Failed to flush outgoing items - block in start_workers

Hi,

I am using the fluentd forward output provided by the Red Hat OpenShift Logging Operator to forward my logs to a Logstash instance, and the Azure Log Analytics output plugin to forward the logs on to an Azure Log Analytics workspace.
I am continuously seeing the following warning in my Logstash logs:

May 18 09:48:53 cbxrelaysrvsit1 logstash[39503]: [2022-05-18T09:48:53,219][WARN ][logstash.outputs.azureloganalytics][main][43328c4a9749fe314828ca147d3b05e72c47d0cd340a51d239257c16fca6eac1] Failed to flush outgoing items {:outgoing_count=>1, :exception=>"JSON::GeneratorError", :backtrace=>["json/ext/GeneratorMethods.java:79:in `to_json'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/microsoft-logstash-output-azure-loganalytics-1.0.0/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb:41:in `flush'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/buffer.rb:219:in `block in buffer_flush'", "org/jruby/RubyHash.java:1415:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/buffer.rb:216:in `buffer_flush'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/buffer.rb:159:in `buffer_receive'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/microsoft-logstash-output-azure-loganalytics-1.0.0/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb:28:in `add_event_document'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/microsoft-logstash-output-azure-loganalytics-1.0.0/lib/logstash/outputs/microsoft-logstash-output-azure-loganalytics.rb:85:in `block in multi_receive'", "org/jruby/RubyArray.java:1809:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/microsoft-logstash-output-azure-loganalytics-1.0.0/lib/logstash/outputs/microsoft-logstash-output-azure-loganalytics.rb:78:in `multi_receive'", "org/logstash/config/ir/compiler/OutputStrategyExt.java:143:in `multi_receive'", "org/logstash/config/ir/compiler/AbstractOutputDelegatorExt.java:121:in `multi_receive'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:295:in `block in start_workers'"]}

I am also using Filebeat on some other VMs and am able to aggregate those logs to Azure Log Analytics without problems; I am only facing this issue with the fluent input.
The following is my Logstash pipeline.conf:

input {
  beats {
    port => "5044"
  }
  tcp {
    codec => fluent
    port => 4000
  }
}

filter {
  mutate {
    copy => { "json_log_s" => "Message" }
  }
}

output {
  if "PTE" in [tags] {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "PTE_OCP"
    }
  }
  else if "test" in [tags] {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "TEST"
    }
  }
  else if "SIT2" in [tags] {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "SIT2_OCP"
    }
  }
  else if "SIT" in [tags] {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "SIT_OCP"
    }
  }
  else if "UAT" in [tags] {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "UAT_OCP"
    }
  }
  else {
    microsoft-logstash-output-azure-loganalytics {
      workspace_id => "<your-workspace-id>"
      workspace_key => "<your-workspace-key>"
      custom_log_table_name => "remrep_ocp"
    }
  }
}

Could you please help me understand this issue?

Best Regards,
Pavan

The to_json call is finding something in the message that it does not know how to encode. JSON is Unicode. If your message is in some other character encoding then it may contain bytes that are not valid Unicode (or values such as NaN that JSON cannot represent), in which case this exception will occur.

You need to find out what encoding the source of your events is using. You can force the encoding to UTF-8 in a ruby filter once you know that. This may help you understand the problem.
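
For illustration, a minimal irb sketch (not from the original reply; it borrows the iPhone example quoted later in this thread and assumes ISO-8859-1 as the source encoding) showing why to_json raises this exception and how re-encoding fixes it:

require "json"

s = "iPhone\xAE"   # \xAE is the (R) sign in ISO-8859-1, but not a valid UTF-8 byte sequence
s.valid_encoding?  #=> false
s.to_json          # raises JSON::GeneratorError, as in the log above

s.force_encoding("ISO-8859-1").encode("UTF-8").to_json
#=> "\"iPhone®\""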

Hi @Badger

Thank you very much for that reply. I think you've pushed me in the right direction. I will check the encoding of my logs.
I have 3 questions from your response:

  1. From the solution in the stackoverflow link, I would need to identify the encoding my logs are actually using, in place of the ISO-8859-1 in this example, correct?
"iPhone\xAE".force_encoding("ISO-8859-1").encode("UTF-8")
#=> "iPhone®"
  2. Would I apply this to each one of the characters that are not valid in JSON, or can I do it for the message field in its entirety?
  3. Where exactly do I need to apply this ruby filter?

Best Regards,
Pavan

Yes, you need to find what encoding your message is in. The ruby filter would be something like

ruby {
    code => 'event.set("message", event.get("message").force_encoding("xxx").encode("UTF-8"))'
}
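
If the source encoding turns out to vary, a more defensive variant (a sketch, assuming ISO-8859-1 as the source encoding; substitute whatever you identify) replaces unconvertible bytes instead of raising:

ruby {
  code => '
    msg = event.get("message")
    if msg.is_a?(String)
      # Reinterpret the raw bytes as ISO-8859-1 (an assumption -- use the real
      # source encoding), then transcode to UTF-8, replacing any bytes that
      # cannot be converted instead of raising an exception.
      event.set("message", msg.force_encoding("ISO-8859-1").encode("UTF-8", invalid: :replace, undef: :replace))
    end
  '
}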

@Badger I am also seeing this at startup in my Logstash logs:

[2022-05-19T09:53:32,292][ERROR][logstash.codecs.fluent   ][main][4fe2598517a75ecc0668a0a6e7a9f6e9c056628fcac106ac2ef5213729b2555f] Fluent parse error, original data now in message field {:error=>#<NoMethodError: undefined method `merge' for 1:Integer>, :data=>102}
[2022-05-19T09:53:32,303][ERROR][logstash.codecs.fluent   ][main][4fe2598517a75ecc0668a0a6e7a9f6e9c056628fcac106ac2ef5213729b2555f] Fluent parse error, original data now in message field {:error=>#<TypeError: can't convert nil into an exact number>, :data=>"pod_ip"}
[2022-05-19T09:53:32,309][ERROR][logstash.codecs.fluent   ][main][4fe2598517a75ecc0668a0a6e7a9f6e9c056628fcac106ac2ef5213729b2555f] Fluent parse error, original data now in message field {:error=>#<TypeError: can't convert nil into an exact number>, :data=>"host"}

Are these the characters that are causing me trouble?

Best Regards,
Pavan

You would need to look at the [message] field (as the error message says) to determine what the data looks like, and why the codec cannot process it.
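
One way to inspect it (a sketch, not from the original reply) is to temporarily add a stdout output with the rubydebug codec, so the full event, including the raw [message] field the codec could not parse, is printed to the Logstash log:

output {
  # Temporary debug output: prints every event in full, including the raw
  # [message] field that the fluent codec failed to parse.
  stdout { codec => rubydebug }
}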

Hey @Badger ,

I saw a GitHub post that said to remove codec => fluent from my Logstash inputs. After making that change, I am able to receive the logs, but all of the metadata that Fluentd sends ends up in the Message field itself.
Attaching my Logstash inputs section:

input {
  beats {
    port => "5044"
  }
  tcp {
    port => 4000
  }
}

I am sending the logs to the Azure Log Analytics workspace output, and I am seeing Fluentd's metadata in the Message field as well. I have highlighted the actual message in the log snippet below.

A sample log message I received that was previously throwing the parse error:

\x93هkubernetes.var.log.pods.cibc-iso-sit1_digital-registry-3-nt8xg_83abd5ef-a808-461f-bb83-e543bc8693a1.digital-registry-java-service.0.log\xDB\u0000\u000E\u0006̒\xCEb\x86K\x8C\x8B\xA6docker\x81\xACcontainer_id\xD9@9c8f7f00e4c2ae853111a36fe7ae915593345ac5082c73c717bc3cbfac4e7740\xAAkubernetes\x8C\xAEcontainer_name\xBDdigital-registry-java-service\xAEnamespace_name\xADcibc-iso-sit1\xA8pod_name\xB8digital-registry-3-nt8xg\xAFcontainer_image\xD9Uimage-registry.openshift-image-registry.svc:5000/cibc-iso-sit1/digital-registry:8.0.2\xB2container_image_idٗimage-registry.openshift-image-registry.svc:5000/cibc-iso-sit1/digital-registry@sha256:f7c87983cc55520e0d40f5d19109f3933ffcaaa5464980f174a4755394096cb3\xA6pod_id\xD9$83abd5ef-a808-461f-bb83-e543bc8693a1\xA6pod_ip\xAC10.131.2.185\xA4host\xACisositwrk001\xA6labels\x83\xAAdeployment\xB2digital-registry-3\xB0deploymentconfig\xB0digital-registry\xA4name\xBBdigital-registry-deployment\xAAmaster_url\xBEhttps://kubernetes.default.svc\xACnamespace_id\xD9$5cd3f450-e872-4599-8eee-9c5ca31508e3\xB0namespace_labels\x81\xBBkubernetes_io/metadata_name\xADcibc-iso-sit1\xA7 **message** \xC4\xE32022-05-19 13:52:12,617 DEBUG [http-nio-8761-exec-1] com.netflix.eureka.registry.AbstractInstanceRegistry (getApplicationsFromMultipleRegions:737) - Fetching applications registry with remote regions: false, Regions argument []\xA5level\xA5debug\xA8hostname\xACisositwrk001\xB1pipeline_metadata\x81\xA9collector\x85\xA7ipaddr4\xAC172.16.24.37\xA9inputname\xB5fluent-plugin-systemd\xA4name\xA7fluentd\xABreceived_at\xD9 2022-05-19T13:52:12.620695+00:00\xA7version\xAC1.14.5 1.6.0\xA9openshift\x81\xA6labels\x81\xA9clusterId\xAEremrep-nonprod\xAA

Is there a way I can parse this with Logstash itself?

Best Regards,
Pavan
