Beats: Pure Go packet sniffer - Final Year Project

@steffens @Nicolas_Ruflin
Hi everyone,

I'm a Golang programmer, and a final year Computer Science student at the Federal University of Technology, Akure, Nigeria.

I would like to work on Beats for my Final Year project, specifically the "Pure Go packet sniffer" idea. This was listed as a project idea for GSoC 2019, but Elastic was not accepted, so I guess the project can be handled outside of GSoC.

I would like to know if the scope of the proposed Golang implementation is supposed to match that of libpcap, or if we only need to implement some of the features. I'd also like to understand the motivation behind the idea, and possible challenges I will face during implementation.

I would also like to create a tracking issue on GitHub, where we can discuss implementation details.

Thanks.

There is indeed some overlap with libpcap, but the project idea is not meant as a Go-native replacement for all of libpcap.

The idea of the project is to concentrate on implementing a platform-independent, Go-native packet sniffing library only, with support for multiple backends/implementations. We're not required to provide all implementations from day 0.
The drawbacks we're currently seeing with gopacket and libpcap are:

  • based on CGO, therefore CGO function call overhead for every single packet
  • additional copies
  • not really possible to work with multiple receive queues (NIC RSS)

Therefore it would be nice to have a package with common interfaces/definitions that ease integration with sniffing methods that support zero copy and multiple receive queues/workers for parallelizing the actual packet processing, while still allowing applications to be written on top of these interfaces, independent of the actual integration in use.

Rescheduling packets in software takes an extra performance hit, which might affect throughput in bad ways. Therefore it is mandatory to make use of hardware-backed receive queues.
Some projects/sniffers to integrate with (or draw ideas from):

  • libpcap
  • AF_Packet (very important: fanout support)
  • DPDK (minus the memory handling, it's nice for generic applications, but too inefficient for very simple packet sniffing cases)
  • Napatech
  • netmap (also has ports to Linux and Windows)
  • pf_ring
  • Windows support via Npcap (or similar drivers)

While we can draw inspiration from these projects, and design types/interfaces such that these projects can be integrated in the future, I'd say it makes sense to target only one sniffer type per OS for starters (here we can have a look at the libpcap implementations themselves):

  • linux: af_packet
  • windows: npcap
  • freebsd/darwin: not sure.

Passive sniffers are often subject to packet drops if the application is not fast enough. Therefore some metrics support is a must, in order to monitor how the sniffer/application is doing. Depending on the sniffer type, this can be provided to the user by the library or by external tools. This means we very much allow for differences between sniffer types to exist in their public interface, yet we require them to follow a narrowed-down set of protocols/interfaces for use with applications/libraries that will be built on top of the library.
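
As a small illustration of the metrics part, the sketch below counts drops when a bounded buffer between sniffer and application is full. `Stats`, `StatsProvider` and `counters` are hypothetical names, and the counter semantics are an assumption of this example; real backends report drops in their own ways.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Stats is a hypothetical snapshot of capture counters.
type Stats struct {
	Received uint64 // packets queued for the application
	Dropped  uint64 // packets lost because the application was too slow
}

// StatsProvider would be implemented by backends that can report counters
// themselves; others may rely on external tools instead.
type StatsProvider interface {
	Stats() Stats
}

// counters holds the live values. It is updated atomically so that workers
// and a monitoring goroutine can use it concurrently.
type counters struct {
	received uint64
	dropped  uint64
}

func (c *counters) Stats() Stats {
	return Stats{
		Received: atomic.LoadUint64(&c.received),
		Dropped:  atomic.LoadUint64(&c.dropped),
	}
}

// offer enqueues pkt without blocking. A full buffer means the consumer is
// not keeping up, so the packet is counted as dropped.
func (c *counters) offer(ch chan<- []byte, pkt []byte) {
	select {
	case ch <- pkt:
		atomic.AddUint64(&c.received, 1)
	default:
		atomic.AddUint64(&c.dropped, 1)
	}
}

func main() {
	c := &counters{}
	var sp StatsProvider = c
	ch := make(chan []byte, 2) // small buffer and no consumer: forces drops
	for i := 0; i < 5; i++ {
		c.offer(ch, []byte{byte(i)})
	}
	st := sp.Stats()
	fmt.Printf("received=%d dropped=%d\n", st.Received, st.Dropped)
}
```

Keeping `StatsProvider` as a separate, optional interface would let a sniffer type without native counters still satisfy the narrower common interface.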

Packet filtering is just as important. Pcap filter support is not exactly a must, but nice to have. Pcap filters are compiled to BPF by the libpcap library. I think it's valid to still use libpcap for this use-case. Filtering capabilities depend on the API/NIC to be used. Some require the configuration to be made via separate tools.

With support for abstraction and multiple receive queues, we can also think about 'merging' sniffers/interfaces, or starting multiple workers per interface and its receive queues, so to say. For example, on Linux libpcap supports the any device. This has the drawbacks that it only works on Linux, that it combines the packets from multiple devices into one common stream, and that it creates a so-called SLL ("cooked" capture) header, which requires some special handling (to be expected if some device is not a pure Ethernet device). Instead it would be nice if we could collect all known devices (including loopback) and start workers per device if the any device is configured.
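
A minimal sketch of the "workers per device" idea, assuming a hypothetical `captureFunc` hook that would open the real capture handle for one device. Only the device enumeration via `net.Interfaces` is real here; the capture itself is faked.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// Frame is a hypothetical captured frame tagged with its source device.
type Frame struct {
	Device string
	Data   []byte
}

// captureFunc is a hypothetical hook: it would sniff on one device and send
// frames to out until the capture ends.
type captureFunc func(device string, out chan<- Frame) error

// upDevices returns the names of all interfaces that are up, loopback included.
func upDevices() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	var names []string
	for _, ifc := range ifaces {
		if ifc.Flags&net.FlagUp != 0 {
			names = append(names, ifc.Name)
		}
	}
	return names, nil
}

// captureAll runs capture once per device, each in its own goroutine, and
// closes out after all workers have returned.
func captureAll(devices []string, capture captureFunc, out chan<- Frame) {
	var wg sync.WaitGroup
	for _, dev := range devices {
		wg.Add(1)
		go func(dev string) {
			defer wg.Done()
			if err := capture(dev, out); err != nil {
				fmt.Println("capture failed on", dev, ":", err)
			}
		}(dev)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
}

func main() {
	devices, err := upDevices()
	if err != nil {
		fmt.Println("listing interfaces failed:", err)
		return
	}
	// Fake capture: emits one empty frame per device instead of sniffing.
	fake := func(dev string, ch chan<- Frame) error {
		ch <- Frame{Device: dev}
		return nil
	}
	out := make(chan Frame)
	captureAll(devices, fake, out)
	for f := range out {
		fmt.Println("frame from", f.Device)
	}
}
```

Because every frame keeps its device name and link type per worker, no synthetic common header is needed, unlike with the any device.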

Native support for pcap and pcapng files would be nice to have as well, with some more support than just blindly reading and forwarding packets as is (e.g. by wrapping the sniffer). Some applications are very sensitive to timing/availability. Normally one is well served by writing an application to be independent of the machine's clock (e.g. have internal timers + timeouts based on packet timestamps and deltas between timestamps), but unfortunately this is not always the case (some library to support this would be great). For simulating use-cases via pcaps (e.g. debugging traces from other users) it's nice if a file-based sniffer can introduce delays based on packet timestamps as well.

Feel free to ask if you have any more questions or if things are not fully clear to you.

> I would also like to create a tracking issue on Github, where we can be able to discuss implementation details.

We'd rather use GitHub for tracking actual issues. For the time being I think we should continue discussing here.

Thanks for your reply, @steffens.
Now I understand better what the issues are with the current implementation.
I will do some reading, and I look forward to working with the community as soon as possible.
