packetbeat.protocols.http:
  # Configure the ports where to listen for HTTP traffic. You can disable
  # the HTTP protocol by commenting out the list of ports.
  ports: [80, 8080, 8000, 5000, 8002]
  send_request: true
  send_response: true
  include_body_for: ['image/webp','application/x-javascript','*/*','application/octet-stream','text/xml','application/xml','application/xhtml+xml','text/plain','text/html','application/json','text/javascript','application/javascript','application/x-www-form-urlencoded']
  split_cookie: true
  send_all_headers: true
The HTTP module compares the entries in include_body_for with the Content-Type string (it must be an exact match). I don't see a Content-Type header in your sample.
I am able to see the body only for messages with Content-Type 'text/html', and those are always HTTP 401 responses.
Currently, in my project, I need to capture the SOAP messages (which are in XML format by the way) in HTTP 200 messages.
But the problem is that these packets have no Content-Type. I cannot see a Content-Type even in Wireshark, but I can see a Data field which has the SOAP requests and responses. Why can't I see the same in Packetbeat?
Here is a screenshot of an HTTP 200 packet in Wireshark (captured as a .pcap file using tcpdump in the same environment):
Why can't I get the request/response body regardless of whether I use include_body_for? It seems Packetbeat needs the Content-Type field to determine the type of body in these HTTP packets.
Either there is something that I am missing, or there is an issue or a limitation in Packetbeat itself.
It is a limitation in Packetbeat itself, though a deliberate one. Packetbeat requires a Content-Type header to be present, and it requires the type to match an entry in include_body_for. Without a Content-Type there is no real guarantee about the actual contents; it's just some bytes. Packetbeat will not try to index something that is potentially binary.
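To make the rule above concrete, here is a minimal Python sketch of the decision as described in this thread. It is not Packetbeat's actual code: the function name should_include_body is hypothetical, and the exact comparison (rather than, say, substring or wildcard matching) is an assumption taken from the "must be exact match" remark earlier in the thread.

```python
from typing import Mapping, Optional, Sequence


def should_include_body(headers: Mapping[str, str],
                        include_body_for: Sequence[str]) -> bool:
    """Illustrative rule: keep the body only if a Content-Type header
    is present AND its value matches an entry in include_body_for.

    Hypothetical helper, not part of Packetbeat.
    """
    content_type: Optional[str] = None
    # HTTP header names are case-insensitive, so normalise before comparing.
    for name, value in headers.items():
        if name.lower() == "content-type":
            content_type = value.strip().lower()
            break

    # No Content-Type at all: the body is "just some bytes", so skip it.
    if not content_type:
        return False

    # Assumption: exact match against the configured list.
    return content_type in {entry.lower() for entry in include_body_for}


include = ["text/xml", "application/json"]
print(should_include_body({"Content-Type": "text/xml"}, include))   # True
print(should_include_body({"Content-Length": "512"}, include))      # False
```

Under this rule a SOAP message sent without a Content-Type header is dropped no matter what include_body_for contains, which matches the behaviour reported in the question.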
The easiest fix (I hope) is to have your application set the Content-Type properly. I understand it's not always possible to modify an application/library. Feel free to open an enhancement request or a PR introducing a new setting to include the body if the Content-Type is missing (related code path is here)
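If the SOAP client is under your control, setting the header can be as small as the sketch below. It uses Python's standard urllib.request only to build (not send) a request; the URL, the SOAP envelope, and the helper name build_soap_request are placeholders invented for illustration, and 'text/xml' is chosen because it already appears in the include_body_for list above. The server would need to do the equivalent for its responses.

```python
import urllib.request

# Hypothetical endpoint and payload, for illustration only.
SOAP_URL = "http://localhost:8080/soap"
SOAP_BODY = (
    b'<?xml version="1.0"?>'
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    b"<soap:Body/>"
    b"</soap:Envelope>"
)


def build_soap_request(url: str, body: bytes) -> urllib.request.Request:
    """Build a POST request that declares its body as XML."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


request = build_soap_request(SOAP_URL, SOAP_BODY)
# urllib stores header names capitalised as "Content-type";
# header names are case-insensitive on the wire.
print(request.get_header("Content-type"))  # text/xml
```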