Large number of Agent errors/missing data

Hello

We have Elastic Agent with the security integration (Elastic Defend) enabled and configured. When testing, I can see that the test file (EICAR) is detected by the agent, but no alert shows up in Kibana. I can see the alert logged to syslog, but not in the index .ds-logs-endpoint.alerts-default-*.
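
(For anyone checking the same thing: a query like the one below in Kibana Dev Tools should show whether anything landed in the data stream at all. This is just a sketch; adjust the namespace and time range to your setup.)

GET logs-endpoint.alerts-default/_search
{
  "query": {
    "range": { "@timestamp": { "gte": "now-24h" } }
  }
}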

The Agent output is set to Logstash.
Logstash reports:

[INFO ][org.logstash.beats.BeatsHandler] [local: LOGSTASH_IP:5044, remote: AGENT_IP:55852] Handling exception: java.lang.RuntimeException: Unable to parse beats payload (caused by: com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (20054016) exceeds the maximum length (20000000))

Any idea how to solve this?

This is with a trial premium license.

Hi @gyterpena

That's certainly unexpected. Since Endpoint documents need to pass through Logstash in your setup, anything that causes them to be dropped there would prevent them from appearing in Elasticsearch/Kibana, as you're seeing.

  1. First off, it's possible the error you're seeing is unrelated to the issue causing the Endpoint alert not to appear in Elasticsearch. That size (20 MB) is far bigger than any expected Endpoint alert. Can you make sure that your Logstash setup is forwarding all of the indices listed below?
    logs-endpoint.alerts-*
    .logs-endpoint.action.responses-*
    .logs-endpoint.diagnostic.collection-*
    logs-endpoint.events.file-*
    logs-endpoint.events.network-*
    logs-endpoint.events.process-*
    logs-endpoint.events.registry-*
    logs-endpoint.events.library-*
    logs-endpoint.events.security-*
    logs-endpoint.events.api-*
    logs-endpoint.events.volume_device-*
    metrics-endpoint.metadata-*
    metrics-endpoint.policy-*
    metrics-endpoint.metrics-*
    .logs-endpoint.heartbeat-*
  2. One way to see whether those errors are caused by Endpoint documents or by a Beat is to test again with just Endpoint (Elastic Defend) added to the Agent policy. I've never seen Endpoint documents (or bulk sends) anywhere close to 20 MB, so I wonder if those exceptions are actually from a Beat. (That's not to say it still wouldn't be an issue to fix, but understanding the source of those exceptions would help.)

  3. A workaround for this issue is available in Logstash as of 8.12.0: start Logstash with an increased maximum using the JVM option -Dlogstash.jackson.stream-read-constraints.max-string-length=200000000 (see the sketch just below this list). This is the easiest way to address those exceptions. Does that solution work for you?
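
    A minimal sketch of how that option can be passed, assuming a default package install (paths and service setup may differ in your environment):

    # config/jvm.options -- add the flag on its own line
    -Dlogstash.jackson.stream-read-constraints.max-string-length=200000000

    # or export it before starting Logstash, e.g. in a shell or systemd override
    export LS_JAVA_OPTS="-Dlogstash.jackson.stream-read-constraints.max-string-length=200000000"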

Let me know.

Hello @ferullo

  1. The Logstash output is set as below (the pipeline is Elastic Agent specific; we run a second pipeline for Beats, which is to be migrated to Agent). The listed indices do exist and have some older data, so some data did make it through in the past.
output {
  elasticsearch {
    id => "out_agent_elasticsearch"
    http_compression => true
    ssl_enabled => true
    ssl_verification_mode => "none"
    hosts => [ "https://ES1", "https://ES2", "https://ES3" ]
    manage_template => false
    user => "${LWU}"
    password => "${LWP}"
    ssl_certificate_authorities => "/etc/logstash/elasticsearch-ca.pem"
  }
}
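
For comparison, my understanding is that the documented Agent-to-Logstash setup forwards events back to their original data streams with an output roughly like the following. This is only a sketch, with hosts and credentials as placeholders:

output {
  elasticsearch {
    hosts => [ "https://ES1" ]
    data_stream => "true"    # route events to their original data streams
    user => "${LWU}"
    password => "${LWP}"
    ssl_certificate_authorities => "/etc/logstash/elasticsearch-ca.pem"
  }
}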

The policy output advanced section is as below (we previously had an issue where the Agent sent more than 500 events to Logstash in one bulk request, and that also failed):

[elastic_agent.endpoint_security][notice] BulkQueueConsumer.cpp:260 The number of left over documents to be sent (500) is equal the max setting (500)
loadbalance: true
ssl.verification: none
worker: 1
bulk_max_size: 250
slow_start: true
pipelining: 0
  2. The policy had Elastic Defend (in detect mode, with all options enabled) and Osquery Manager (now removed); the issue persists.
    In the Agent logs, the following (presumably corresponding) error is logged under the elastic_agent.endpoint_security event.dataset:
[elastic_agent.endpoint_security][error] LogstashClient.cpp:651 Attempt to read data from Logstash server at logstash(01|02):5044, connection was closed
  3. Setting
    -Dlogstash.jackson.stream-read-constraints.max-string-length=200000000
    or
    -Dlogstash.jackson.stream-read-constraints.max-string-length=210000000

    results in the following (tested on Logstash 8.12.1 and 8.12.2, Agent 8.12.2, ES 8.12.2):

[2024-02-28T07:38:27,471][INFO ][org.logstash.beats.BeatsHandler] [local: Logstash_IP:5044, remote: Agent_IP:41614] Handling exception: org.logstash.ackedqueue.QueueRuntimeException: data to be written is bigger than page capacity (caused by: org.logstash.ackedqueue.QueueRuntimeException: data to be written is bigger than page capacity)
[2024-02-28T07:38:27,472][WARN ][io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
org.logstash.ackedqueue.QueueRuntimeException: data to be written is bigger than page capacity
        at org.logstash.ackedqueue.Queue.write(Queue.java:428) ~[logstash-core.jar:?]
        at org.logstash.ackedqueue.ext.JRubyAckedQueueExt.rubyWrite(JRubyAckedQueueExt.java:132) ~[logstash-core.jar:?]
        at org.logstash.ext.JrubyAckedWriteClientExt.doPush(JrubyAckedWriteClientExt.java:60) ~[logstash-core.jar:?]
        at org.logstash.ext.JRubyWrappedWriteClientExt.lambda$push$0(JRubyWrappedWriteClientExt.java:113) ~[logstash-core.jar:?]
        at org.logstash.instrument.metrics.timer.ConcurrentLiveTimerMetric.time(ConcurrentLiveTimerMetric.java:47) ~[logstash-core.jar:?]
        at org.logstash.ext.JRubyWrappedWriteClientExt.lambda$executeWithTimers$2(JRubyWrappedWriteClientExt.java:148) ~[logstash-core.jar:?]
        at org.logstash.instrument.metrics.timer.ConcurrentLiveTimerMetric.time(ConcurrentLiveTimerMetric.java:47) ~[logstash-core.jar:?]
        at org.logstash.ext.JRubyWrappedWriteClientExt.lambda$executeWithTimers$3(JRubyWrappedWriteClientExt.java:148) ~[logstash-core.jar:?]
        at org.logstash.instrument.metrics.timer.ConcurrentLiveTimerMetric.time(ConcurrentLiveTimerMetric.java:47) ~[logstash-core.jar:?]
        at org.logstash.ext.JRubyWrappedWriteClientExt.executeWithTimers(JRubyWrappedWriteClientExt.java:148) ~[logstash-core.jar:?]
        at org.logstash.ext.JRubyWrappedWriteClientExt.push(JRubyWrappedWriteClientExt.java:113) ~[logstash-core.jar:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.codec_callback_listener.RUBY$method$process_event$0(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/codec_callback_listener.rb:23) ~[?:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.patch.RUBY$block$accept$1(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/patch.rb:10) ~[?:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_codec_minus_plain_minus_3_dot_1_dot_0.lib.logstash.codecs.plain.RUBY$method$decode$0(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-codec-plain-3.1.0/lib/logstash/codecs/plain.rb:54) ~[?:?]
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:165) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185) ~[jruby.jar:?]
        at org.jruby.ir.targets.indy.InvokeSite.failf(InvokeSite.java:866) ~[jruby.jar:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.patch.RUBY$method$accept$0(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/patch.rb:9) ~[?:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.patch.RUBY$method$accept$0$__VARARGS__(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/patch.rb:8) ~[?:?]
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112) ~[jruby.jar:?]
        at org.jruby.RubyClass.finvokeWithRefinements(RubyClass.java:522) ~[jruby.jar:?]
        at org.jruby.RubyBasicObject.send(RubyBasicObject.java:1707) ~[jruby.jar:?]
        at org.jruby.RubyBasicObject$INVOKER$i$send.call(RubyBasicObject$INVOKER$i$send.gen) ~[jruby.jar:?]
        at org.jruby.ir.targets.indy.InvokeSite.fail(InvokeSite.java:777) ~[jruby.jar:?]
        at usr.share.logstash.vendor.jruby.lib.ruby.stdlib.delegate.RUBY$method$method_missing$0(/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/delegate.rb:87) ~[?:?]
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:175) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:222) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:228) ~[jruby.jar:?]
        at org.jruby.runtime.Helpers$MethodMissingWrapper.call(Helpers.java:642) ~[jruby.jar:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.message_listener.RUBY$method$onNewMessage$0(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/message_listener.rb:52) ~[?:?]
        at usr.share.logstash.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_beats_minus_6_dot_8_dot_0_minus_java.lib.logstash.inputs.beats.message_listener.RUBY$method$onNewMessage$0$__VARARGS__(/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-input-beats-6.8.0-java/lib/logstash/inputs/beats/message_listener.rb:33) ~[?:?]
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139) ~[jruby.jar:?]
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112) ~[jruby.jar:?]
        at org.jruby.gen.LogStash$$Inputs$$Beats$$MessageListener_1504800100.onNewMessage(org/jruby/gen/LogStash$$Inputs$$Beats$$MessageListener_1504800100.gen:13) ~[?:?]
        at org.logstash.beats.BeatsHandler.channelRead0(BeatsHandler.java:56) ~[logstash-input-beats-6.8.0.jar:?]
        at org.logstash.beats.BeatsHandler.channelRead0(BeatsHandler.java:12) ~[logstash-input-beats-6.8.0.jar:?]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[netty-codec-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[netty-codec-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:61) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:425) ~[netty-transport-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[netty-common-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-common-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.100.Final.jar:4.1.100.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.100.Final.jar:4.1.100.Final]
        at java.lang.Thread.run(Thread.java:840) [?:?]

Edit:
Setting
queue.page_capacity: 128mb
in logstash.yml gets rid of the above error, and I can now see test alerts in Kibana.
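
For reference, this is roughly the relevant part of logstash.yml (a sketch; page_capacity only applies when the persisted queue is in use, which the acked-queue error above suggests):

# logstash.yml
queue.type: persisted        # the acked/persisted queue, which stores events in pages on disk
queue.page_capacity: 128mb   # default is 64mb; a single event larger than a page cannot be enqueued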

We have the mapper-size plugin enabled, but the .* (system) indices don't have the _size field populated, so I can't find out which event was causing this.
