[pgsql] Flow stop

Hi,

I have many installations of packetbeat connected to elasticsearch. Sometimes, on a few VMs, the packetbeat flow stops with this error:

2017-01-30T06:39:30+01:00 ERR Pgsql invalid column_length=4294967295, buffer_length=4, i=3
2017-01-30T06:39:30+01:00 ERR Pgsql invalid column_length=4294967295, buffer_length=4, i=3
2017-01-30T06:39:30+01:00 WARN Pgsql parser expected extended query message, but received command of type 80
2017-01-30T06:39:37+01:00 INFO Non-zero metrics in the last 30s: pgsql.unmatched_responses=348 libbeat.es.publish.read_bytes=13257 libbeat.es.publish.write_bytes=481378 libbeat.es.published_and_acked_events=783 libbeat.es.call_count.PublishEvents=31 libbeat.publisher.published_events=776 tcp.dropped_because_of_gaps=2 libbeat.publisher.messages_in_worker_queues=261
2017-01-30T06:40:07+01:00 INFO No non-zero metrics in the last 30s
2017-01-30T06:40:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-30T06:41:07+01:00 INFO No non-zero metrics in the last 30s
2017-01-30T06:41:37+01:00 INFO No non-zero metrics in the last 30s
2017-01-30T06:42:07+01:00 INFO No non-zero metrics in the last 30s

My config:

packetbeat.interfaces.device: any

packetbeat.flows:
  timeout: 30s

packetbeat.protocols.pgsql:
  enabled: true
  ports: 6432

output.elasticsearch:
  hosts: xxx:9200
  template.enabled: true
  template.path: packetbeat.template.json

Packetbeat is connecting to pgbouncer

The other VMs continue to send data normally.
After I restart the packetbeat process, the flow returns data again.

Do you have any idea what the problem is?
Thanks

Looks like packetbeat is seeing a message it cannot handle. No idea what command type 80 is, though. It can happen that the protocol analyzer is not in sync with the actual network stream. This can happen either when packetbeat starts (if it starts capturing in the middle of an ongoing transaction) or after some packet loss; e.g. the tcp.dropped_because_of_gaps=2 in your log indicates packet loss. Have you tried running packetbeat with af_packet?

Btw, if the parser detects an error, it drops the connection and restarts capturing it. That is, a restart of packetbeat should not be necessary.
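
For reference, the two odd numbers in the log can be decoded by hand. This is a minimal sketch, not Packetbeat code; the helper names are made up for illustration. In the PostgreSQL v3 wire protocol a DataRow column length of -1 marks a NULL value, and the byte 'P' is the type of a frontend Parse message. Whether that fully explains the errors above is an assumption.

```python
import struct


def as_int32(unsigned_value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed big-endian int32."""
    raw = struct.pack("!I", unsigned_value)  # the four bytes as seen on the wire
    (signed_value,) = struct.unpack("!i", raw)
    return signed_value


def message_type(code: int) -> str:
    """Map a numeric message type from the log back to its ASCII type byte."""
    return chr(code)


if __name__ == "__main__":
    # column_length=4294967295 from the ERR lines
    print(as_int32(4294967295))
    # "command of type 80" from the WARN line
    print(message_type(80))
```

Read this way, column_length=4294967295 is -1 seen as an unsigned number, and type 80 is the character 'P'.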

Thanks for your answer.

I am trying to run packetbeat with the af_packet option.
The flow has been running normally since this morning. I'll see tomorrow whether it's still OK.
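
For anyone following along, a sketch of the sniffer settings this refers to, using Packetbeat's `packetbeat.interfaces.type` and `packetbeat.interfaces.buffer_size_mb` options. The buffer size shown is only an illustrative value, not a recommendation from this thread.

```yaml
packetbeat.interfaces.device: any
# Use the memory-mapped af_packet sniffer (Linux only) instead of the default.
packetbeat.interfaces.type: af_packet
# Illustrative value: a larger kernel/userspace buffer leaves more headroom before packets are dropped.
packetbeat.interfaces.buffer_size_mb: 100
```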

Indeed, I don't understand why packetbeat stops the flow and can't restart automatically... Is there a default parameter for this?

Sorry, but I don't fully understand what you're saying. There is no parameter to restart a protocol analyzer. It just drops some state and continues processing. One of the problems with packet loss is too much state potentially piling up in memory. This requires more time to clean up state. Plus, whenever state needs to be dropped, we have to resync the TCP stream. To resync, the parser has to try to parse the TCP stream from the middle (and potentially throw away state while it is not yet in sync), adding more temporary state to be cleaned up. That is, packet loss can make problems worse due to the additional processing required (it's a vicious cycle packetbeat might not be able to escape from). By restarting packetbeat you drop all state, which might help for some time, until the vicious cycle begins again (all passive deep packet analyzers face similar problems).
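
To illustrate what "parsing the stream from the middle" means, here is a toy sketch. It is not Packetbeat's actual algorithm; the function names, the set of accepted type bytes, and the length cap are all assumptions made for the example. It only uses the PostgreSQL v3 framing (one type byte, then a 4-byte big-endian length that includes itself) and accepts an offset once two consecutive messages parse cleanly.

```python
import struct

KNOWN_TYPES = set(b"QPBEDSCTZ")  # a handful of v3 message type bytes, for the demo
MAX_LEN = 1 << 20                # hypothetical sanity cap on a message length


def frame(msg_type: bytes, payload: bytes) -> bytes:
    """Build one v3-style message: type byte, int32 length (includes itself), payload."""
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


def parse_at(buf: bytes, pos: int):
    """Return the end offset of a plausible message starting at pos, else None."""
    if pos + 5 > len(buf) or buf[pos] not in KNOWN_TYPES:
        return None
    (length,) = struct.unpack_from("!I", buf, pos + 1)
    end = pos + 1 + length
    if length < 4 or length > MAX_LEN or end > len(buf):
        return None
    return end


def resync(buf: bytes) -> int:
    """Return the first offset where a message parses and is followed by
    another parsable message (or the end of the buffer); -1 if none."""
    for pos in range(len(buf)):
        end = parse_at(buf, pos)
        if end is None:
            continue  # everything tried here is wasted work and temporary state
        if end == len(buf) or parse_at(buf, end) is not None:
            return pos
    return -1


if __name__ == "__main__":
    stream = frame(b"Q", b"SELECT 1\x00") + frame(b"Z", b"I")
    damaged = stream[7:]  # simulate a gap: capture resumes mid-message
    print(resync(damaged))
```

Every byte skipped before the parser locks on again is extra work, and a heuristic like this can also lock onto the wrong offset, which is part of why repeated gaps get expensive.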
