Why does Logstash send RST rather than FIN on client_inactivity_timeout?

hi all

Logstash 5.4.1, Filebeat 5.4.1, Ubuntu, development machines (no traffic other than what is shown below).

I have seen a number of threads about the following Filebeat log message, where it was sometimes caused by misconfiguration, throttling, etc. (port 8998 is the Logstash port):

2017/06/19 02:48:50.308090 sync.go:85: ERR Failed to publish events caused by: write tcp 10.1.0.12:56052->10.1.0.8:8998: write: connection reset by peer
2017/06/19 02:48:50.308121 single.go:91: INFO Error publishing events (retrying): write tcp 10.1.0.12:56052->10.1.0.8:8998: write: connection reset by peer

In my particular case logs are quite infrequent, so the default 60-second client inactivity timer often fires well before the 5-minute Filebeat inactivity timer. Here is how it looks from the Filebeat host: packet #8 is the RST from Logstash, which sets the stage for the error above when the next log message is ready to be sent at packet #12 (the rest of the transaction is snipped).

abc@host5:/home/abc$ tshark -i ethGi1 -n "tcp port 8998"  
Capturing on 'ethGi1'
  1   0.000000    10.1.0.12 -> 10.1.0.8     TCP 74 56052 > 8998 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1651206007 TSecr=0 WS=128
  2   0.000164     10.1.0.8 -> 10.1.0.12    TCP 74 8998 > 56052 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=264028982 TSecr=1651206007 WS=128
  3   0.000207    10.1.0.12 -> 10.1.0.8     TCP 66 56052 > 8998 [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=1651206007 TSecr=264028982
  4   0.000839    10.1.0.12 -> 10.1.0.8     TCP 594 56052 > 8998 [PSH, ACK] Seq=1 Ack=1 Win=29312 Len=528 TSval=1651206007 TSecr=264028982
  5   0.000952     10.1.0.8 -> 10.1.0.12    TCP 66 8998 > 56052 [ACK] Seq=1 Ack=529 Win=30080 Len=0 TSval=264028982 TSecr=1651206007
  6   0.106094     10.1.0.8 -> 10.1.0.12    TCP 72 8998 > 56052 [PSH, ACK] Seq=1 Ack=529 Win=30080 Len=6 TSval=264029008 TSecr=1651206007
  7   0.106121    10.1.0.12 -> 10.1.0.8     TCP 66 56052 > 8998 [ACK] Seq=529 Ack=7 Win=29312 Len=0 TSval=1651206033 TSecr=264029008
  8  60.057794     10.1.0.8 -> 10.1.0.12    TCP 66 8998 > 56052 [RST, ACK] Seq=7 Ack=529 Win=30080 Len=0 TSval=264043996 TSecr=1651206033
  9 116.003717    10.1.0.12 -> 10.1.0.8     TCP 74 56130 > 8998 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=1651235008 TSecr=0 WS=128
 10 116.003884     10.1.0.8 -> 10.1.0.12    TCP 74 8998 > 56130 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=264057983 TSecr=1651235008 WS=128
 11 116.003929    10.1.0.12 -> 10.1.0.8     TCP 66 56130 > 8998 [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=1651235008 TSecr=264057983
 12 116.005091    10.1.0.12 -> 10.1.0.8     TCP 508 56130 > 8998 [PSH, ACK] Seq=1 Ack=1 Win=29312 Len=442 TSval=1651235008 TSecr=264057983
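
To double-check that the RST lines up with the 60-second timer, here is a small Python sketch that measures how long the connection was silent before each RST. The helper name `idle_before_rst` and the regex are my own, the two capture lines are packets #7 and #8 from above with the trailing TSval/TSecr fields trimmed, and the sketch assumes a single connection in the capture (it does not group packets by port).

```python
import re

# Packets #7 and #8 from the capture above (TSval/TSecr trimmed for brevity).
CAPTURE = """\
  7   0.106121    10.1.0.12 -> 10.1.0.8     TCP 66 56052 > 8998 [ACK] Seq=529 Ack=7 Win=29312 Len=0
  8  60.057794     10.1.0.8 -> 10.1.0.12    TCP 66 8998 > 56052 [RST, ACK] Seq=7 Ack=529 Win=30080 Len=0
"""

# Packet number, relative timestamp, and the first [FLAGS] group on the line.
LINE = re.compile(r"^\s*(\d+)\s+([\d.]+)\s+.*?\[([A-Z, ]+)\]")


def idle_before_rst(capture):
    """Return (packet number, seconds since the previous packet) for each RST."""
    gaps = []
    prev_time = None
    for line in capture.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        number, time, flags = int(m.group(1)), float(m.group(2)), m.group(3)
        if "RST" in flags and prev_time is not None:
            gaps.append((number, time - prev_time))
        prev_time = time
    return gaps


for number, gap in idle_before_rst(CAPTURE):
    print(f"packet {number}: RST after {gap:.3f}s of silence")
```

The gap between packet #7 and packet #8 comes out at just under 60 seconds, which is consistent with the default client inactivity timer being what triggers the reset.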

(Filebeat running in -once mode with the default 5-minute timeout. A single 'new' log line was added to a watched file before starting Filebeat; I waited for the RST from Logstash, then cat'ed another single log line into the watched file. Logstash was set to debug, logging to stdout only.)

I understand the behaviour of client_inactivity_timeout, but why is the connection closed with a RST rather than a FIN? A FIN would allow a graceful teardown of the Filebeat <> Logstash connection and prevent the "connection reset by peer" error message.
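
To illustrate the difference I am asking about, here is a minimal Python sketch of the two ways a server can close an idle TCP connection on a POSIX system. It is not a claim about what Logstash does internally; it only shows that a plain close() produces a FIN (the client reads EOF), while closing with SO_LINGER enabled and a zero linger time produces a RST (the client gets "connection reset by peer"). The function name `close_and_observe` is made up for the example, and the `struct linger` layout of two ints is an assumption that holds on Linux.

```python
import socket
import struct
import threading


def close_and_observe(abortive):
    """Server closes the connection; return what the client sees: 'FIN' or 'RST'."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve():
        conn, _ = listener.accept()
        if abortive:
            # struct linger {l_onoff=1, l_linger=0}: close() aborts the
            # connection and sends RST instead of FIN.
            conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))
        conn.close()

    server = threading.Thread(target=serve)
    server.start()
    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(5)
    server.join()  # the server side is closed from here on
    try:
        data = client.recv(1)
        return "FIN" if data == b"" else "DATA"
    except ConnectionResetError:
        return "RST"
    finally:
        client.close()
        listener.close()


print("graceful close ->", close_and_observe(abortive=False))
print("abortive close ->", close_and_observe(abortive=True))
```

In the graceful case the client can notice the EOF and reconnect before its next write, which is the behaviour I was hoping for from the inactivity timeout.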

thanks
-jeff
