Packetbeat - “ERR Failed to read integer reply: Expected digit”

When analyzing a pcap file containing Redis traffic with Packetbeat, I get output like the following:

➜  packetbeat git:(master) ✗ ./packetbeat -c ./packetbeat.yml -e -I redis_xg-bjdev-rediscluster-1_prot-7101_20161222110711_20161222110721.pcap -E packetbeat.protocols.redis.ports=7101 -t
...
2017/01/11 09:42:37.664290 protos.go:89: INFO registered protocol plugin: amqp
2017/01/11 09:42:37.664309 protos.go:89: INFO registered protocol plugin: http
2017/01/11 09:42:37.664315 protos.go:89: INFO registered protocol plugin: mysql
2017/01/11 09:42:37.664320 protos.go:89: INFO registered protocol plugin: redis
2017/01/11 09:42:37.665784 beat.go:207: INFO packetbeat start running.
2017/01/11 09:47:45.218365 redis_parse.go:306: ERR Failed to read integer reply: Expected digit
2017/01/11 09:42:38.430211 sniffer.go:384: INFO Input finish. Processed 40644 packets. Have a nice day!
2017/01/11 09:42:38.430657 util.go:48: INFO flows worker loop stopped
2017/01/11 09:42:38.430709 logp.go:245: INFO Total non-zero values:  libbeat.publisher.published_events=8080 tcp.dropped_because_of_gaps=15 redis.unmatched_responses=15
2017/01/11 09:42:38.430722 logp.go:246: INFO Uptime: 909.957024ms
2017/01/11 09:42:38.430728 beat.go:211: INFO packetbeat stopped.
➜  packetbeat git:(master) ✗

The error message is "ERR Failed to read integer reply: Expected digit".

After analyzing the same capture with Wireshark for comparison, I found the root cause behind this: when the Redis response is big enough and the network is not good enough, packet loss can happen, which results in the error above.

Snapshot in my test:

According to the packet sequence, I find the following:

  • No.37642 - HMGET with key hr-e0acd6e0-4c21-4917-a676-c4fd8094f2aa:user
*10
$5
HMGET
$44
hr-e0acd6e0-4c21-4917-a676-c4fd8094f2aa:user
$5
18370
$5
52708
$1
0
$5
18370
$4
1117
$2
13
$3
153
$3
147
  • No.37642 - [TCP Previous segment not captured]; this response data corresponds to the last two fields of the HMGET.
:{"id":153,"email":"xiaojiao.xie@xxx","work_code":"E000027","mobile":186xxxx0925,"name":".........","walle_id":77287,"status":6,"pinyin_name":"xxj","sex":1,"security_level":60,"certificate_type":0,"certificate_number":"42068xxxxxxxxx3733","created_at":1431550029000,"updated_at":1449228237000,"nchr_id":"0001A910000000002EQP"}}
$519
{"userId":147,"userBuList":[3175],"tagsList":[],"userBuRoleDto":[{"id":81524,"bu_id":3175,"bu_name":"............BU","role_id":859,"role_name":".........","user_id":147,"user_name":"......"}],"user":{"id":147,"email":"xin.jin@xxxx","work_code":"E000029","mobile":186xxxx5626,"name":"......","walle_id":56063,"status":6,"pinyin_name":"jx","sex":1,"security_level":70,"certificate_type":0,"certificate_number":"420106xxxxxx510","created_at":1431550030000,"updated_at":1449228115000,"nchr_id":"0001A910000000002ERE"}}
  • No.37648 - [TCP Fast Retransmission], which retransmits the lost data segments.
*8
$532
{"userId":18370,"userBuList":[4594],"tagsList":[],"userBuRoleDto":[{"id":120993,"bu_id":4594,"bu_name":"..................","role_id":924,"role_name":"......","user_id":18370,"user_name":"......"}],"user":{"id":18370,"email":"hui.yaobj@xxx","work_code":"E019529","mobile":137xxxx8281,"name":"......","walle_id":23156752,"status":6,"pinyin_name":"yh","sex":1,"security_level":20,"certificate_type":0,"certificate_number":"13098xxxxxx033","created_at":1438657893000,"updated_at":1449228019000,"nchr_id":"0001A910000000013PM5"}}
...
$526
{"userId":153,"userBuList":[3174],"tagsList":[],"userBuRoleDto":[{"id":53306,"bu_id":3174,"bu_name":"............","role_id":922,"role_name":"......","user_id":153,"user_name":"........."}],"user"

In this case the redis_parse.go module does not work correctly, and Packetbeat stops running as soon as the error happens.

So I think this is a bug; or is there a way I can fix it? Thanks in advance.

BTW, I think that if the Redis response happens to be split at a different position, the error message might be different.
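To illustrate why the split position matters, here is a minimal, hypothetical Go sketch of a RESP reply parser (it is not Packetbeat's actual redis_parse.go, just an approximation of the failure mode): when the stream resumes in the middle of a bulk-string payload, whatever byte comes first is treated as the type marker of a new reply. In my capture the resumed data starts with ':{', so the ':' is taken as the start of an integer reply and the parser then fails on '{'; a different split position would hit a different branch and produce a different error.

// Minimal RESP reply-parsing sketch (not Packetbeat's redis_parse.go) showing
// how the byte at the split position determines which error is reported.
package main

import (
    "bufio"
    "fmt"
    "strings"
)

// parseReply reads the RESP type marker and dispatches on it.
func parseReply(r *bufio.Reader) error {
    prefix, err := r.ReadByte()
    if err != nil {
        return err
    }
    switch prefix {
    case ':': // integer reply, e.g. ":1000\r\n"
        return parseInteger(r)
    case '+', '-', '$', '*': // other reply types, skipped in this sketch
        _, err := r.ReadString('\n')
        return err
    default:
        return fmt.Errorf("unexpected reply type marker %q", prefix)
    }
}

// parseInteger accepts only digits up to CRLF, mirroring the failure mode
// behind "Failed to read integer reply: Expected digit".
func parseInteger(r *bufio.Reader) error {
    for {
        b, err := r.ReadByte()
        if err != nil {
            return err
        }
        if b == '\r' {
            _, err = r.ReadByte() // consume the trailing '\n'
            return err
        }
        if b < '0' || b > '9' {
            return fmt.Errorf("expected digit, got %q", b)
        }
    }
}

func main() {
    // Stream resumed mid-payload after packet loss: the first visible bytes
    // are `:{"id":153,...`, so ':' is misread as the start of an integer reply.
    resumed := `:{"id":153,"email":"..."}`
    fmt.Println(parseReply(bufio.NewReader(strings.NewReader(resumed))))
    // prints: expected digit, got '{'
}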

The protocol analyzers in Packetbeat must stay synchronized with the network stream. If they are not in sync, the parser might fail. In that case the internal state is dropped and a new parser instance is created, in the hope of being back in sync with the next packet. As packet loss can occur at any time, we have to drop the parser state in most such cases and try to resync.
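Below is a highly simplified, hypothetical Go sketch of that drop-and-resync behaviour (not the actual Packetbeat/libbeat code): the connection throws away its parser, and with it any buffered state, whenever parsing fails, and starts with a fresh parser on the next segment.

// Toy sketch of dropping parser state and re-syncing (not Packetbeat's code).
package main

import (
    "bytes"
    "fmt"
)

// parser stands in for a per-connection protocol parser with internal state.
type parser struct {
    buffered []byte // bytes of a message still being assembled across segments
}

// feed hands one TCP segment to the parser; an error means we lost sync.
func (p *parser) feed(segment []byte) error {
    if len(p.buffered) == 0 && len(segment) > 0 && segment[0] != '*' {
        // RESP client requests start with '*'; anything else at a message
        // boundary means we resumed mid-payload, e.g. after packet loss.
        return fmt.Errorf("out of sync: unexpected leading byte %q", segment[0])
    }
    p.buffered = append(p.buffered, segment...)
    if bytes.HasSuffix(p.buffered, []byte("\r\n")) {
        p.buffered = nil // pretend the message was completed and published
    }
    return nil
}

// connection owns the parser and recreates it whenever parsing fails.
type connection struct{ p *parser }

func (c *connection) onSegment(segment []byte) {
    if c.p == nil {
        c.p = &parser{}
    }
    if err := c.p.feed(segment); err != nil {
        fmt.Println("parse error, dropping state:", err)
        c.p = &parser{} // fresh parser instance, hoping to re-sync on later packets
    }
}

func main() {
    c := &connection{}
    c.onSegment([]byte("*1\r\n$4\r\nPING\r\n")) // in sync
    c.onSegment([]byte("{\"id\":153}\r\n"))     // resumed mid-payload: state dropped
    c.onSegment([]byte("*1\r\n$4\r\nPING\r\n")) // back in sync with the new parser
}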

Packetbeat stops running because all packets have been sent to the protocol analyzers. Not all events might be published yet; try -waitstop 10s to wait 10 more seconds in case any events are still pending.
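For example, based on the command from the original post (assuming -waitstop takes a plain number of seconds; the exact form may depend on the Packetbeat version, see ./packetbeat -h):

./packetbeat -c ./packetbeat.yml -e -I redis_xg-bjdev-rediscluster-1_prot-7101_20161222110711_20161222110721.pcap -E packetbeat.protocols.redis.ports=7101 -t -waitstop 10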

Without the original pcap, and without being able to debug it myself, I cannot comment on the specific events you observed.

I did some work on my original pcap file, and split it into several parts.

So my conclusion now is: when the network is bad and the Redis response is big enough, whether something goes wrong depends on where the raw data gets split.

Packetbeat stops running because all packets have been sent to the protocol analyzers.

That's right, I get it.

What does "big enough" mean? Currently, Packetbeat internally drops a stream if the active message exceeds 10MB, I think.
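For illustration only, a tiny hypothetical sketch of that kind of size guard (not the actual Packetbeat code; the 10MB figure is simply the number mentioned above): once the message being assembled would exceed the limit, the stream is dropped instead of buffering indefinitely.

// Toy size guard (not Packetbeat's code).
package main

import (
    "errors"
    "fmt"
)

const maxMessageSize = 10 * 1024 * 1024 // assumed 10MB limit, per the reply above

var errMessageTooLarge = errors.New("active message exceeds limit, dropping stream")

// stream accumulates the bytes of the message currently being parsed.
type stream struct {
    message []byte
}

// append adds segment data, refusing to grow past maxMessageSize.
func (s *stream) append(data []byte) error {
    if len(s.message)+len(data) > maxMessageSize {
        return errMessageTooLarge
    }
    s.message = append(s.message, data...)
    return nil
}

func main() {
    s := &stream{}
    fmt.Println(s.append(make([]byte, 1024)))           // <nil>
    fmt.Println(s.append(make([]byte, maxMessageSize))) // active message exceeds limit, dropping stream
}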

"big enough" means that the raw data in redis response should be split into multiple segmented packets (i.e. [PSH, ACK] segment in 37642-37651_plrt_s0_pbErr.pcap). Sorry about my description.
