Thank you for the work you put into this. It's a step in the right direction, but I was still getting errors with that config. I suspect it's because my raw data looks like this:
{"remote_addr": "127.0.0.1","time_local": "27/Jan/2023:11:08:37 -0800","request": "POST /abcd HTTP/1.1", "request_method": "POST","status": "200","user_agent": "curl/7.84.0","headers": {\x22Host\x22:\x22localhost\x22,\x22User-Agent\x22:\x22curl/7.84.0\x22,\x22Accept\x22:\x22*/*\x22,\x22Content-Length\x22:\x2221\x22,\x22Content-Type\x22:\x22application/x-www-form-urlencoded\x22},"http_ssl_ja3": "771,4866-4867-4865-49196-49200-159-52393-52392-52394-49195-49199-158-49188-49192-107-49187-49191-103-49162-49172-57-49161-49171-51-157-156-61-60-53-47-255,0-11-10-13172-16-22-23-49-13-43-45-51,29-23-30-25-24,0-1-2", "http_ssl_ja3_hash": "ba730f97dcd1122e74e65411e68f1b40","request_body": "{\x22test_json\x22: \x22test\x22}"}
I had cleaned up the original post to make it more readable; I don't use newlines in the nginx log format. It also doesn't seem to be applying the mutate. I'm still seeing \x22 in the trace, so I'm not sure why it wouldn't be applying.
[WARN ] 2023-01-27 11:08:40.720 [[main]>worker16] json - Error parsing json {:source=>"message", :raw=>"{\"remote_addr\": \"127.0.0.1\",\"time_local\": \"27/Jan/2023:11:08:37 -0800\",\"request\": \"POST /abcd HTTP/1.1\", \"request_method\": \"POST\",\"status\": \"200\",\"user_agent\": \"curl/7.84.0\",\"headers\": {\\x22Host\\x22:\\x22localhost\\x22,\\x22User-Agent\\x22:\\x22curl/7.84.0\\x22,\\x22Accept\\x22:\\x22*/*\\x22,\\x22Content-Length\\x22:\\x2221\\x22,\\x22Content-Type\\x22:\\x22application/x-www-form-urlencoded\\x22},\"http_ssl_ja3\": \"771,4866-4867-4865-49196-49200-159-52393-52392-52394-49195-49199-158-49188-49192-107-49187-49191-103-49162-49172-57-49161-49171-51-157-156-61-60-53-47-255,0-11-10-13172-16-22-23-49-13-43-45-51,29-23-30-25-24,0-1-2\", \"http_ssl_ja3_hash\": \"ba730f97dcd1122e74e65411e68f1b40\",\"request_body\": \"{\\x22test_json\\x22: \\x22test\\x22}\"}, \"headers\": {\"Host\":\"localhost\",\"User-Agent\":\"curl/7.84.0\",\"Accept\":\"*/*\",\"Content-Length\":\"21\",\"Content-Type\":\"application/x-www-form-urlencoded\"} ,\"http_ssl_ja3\": \"771,4866-4867-4865-49196-49200-159-52393-52392-52394-49195-49199-158-49188-49192-107-49187-49191-103-49162-49172-57-49161-49171-51-157-156-61-60-53-47-255,0-11-10-13172-16-22-23-49-13-43-45-51,29-23-30-25-24,0-1-2\", \"http_ssl_ja3_hash\": \"ba730f97dcd1122e74e65411e68f1b40\",\"request_body\": \"{\\\"test_json\\\": \\\"test\\\"}\"}", :exception=>#<LogStash::Json::ParserError: Unexpected character ('\' (code 92)): was expecting double-quote to start field name
at [Source: (byte[])"{"remote_addr": "127.0.0.1","time_local": "27/Jan/2023:11:08:37 -0800","request": "POST /abcd HTTP/1.1", "request_method": "POST","status": "200","user_agent": "curl/7.84.0","headers": {\x22Host\x22:\x22localhost\x22,\x22User-Agent\x22:\x22curl/7.84.0\x22,\x22Accept\x22:\x22*/*\x22,\x22Content-Length\x22:\x2221\x22,\x22Content-Type\x22:\x22application/x-www-form-urlencoded\x22},"http_ssl_ja3": "771,4866-4867-4865-49196-49200-159-52393-52392-52394-49195-49199-158-49188-49192-107-49187-49191-103-4"[truncated 705 bytes]; line: 1, column: 188]>}
So if I understand this correctly,
mutate { gsub => [ "[@metadata][part2]", '\A.*("headers":\s*{[^}]*}).*\z', "\1" ] }
mutate { gsub => [ "[@metadata][part2]", "\\x22", '"' ] }
is basically selecting the regex match at group index 1 (\1, highlighted in green below) and applying the \x22 => " mutate to that specifically?
Then this line merges all the regex groups back together?
mutate { replace => { "message" => "%{[@metadata][part1]}, %{[@metadata][part2]} %{[@metadata][part3]}" } }
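To make my reading of those three mutate lines concrete, here is a rough Python sketch of what I think they do. It is only an analogy: Logstash's gsub uses Ruby regular expressions, so Python's re.sub is a stand-in. The raw line is a shortened copy of my data, and the part1 and part3 values are hypothetical placeholders, since I'm only guessing at what those fields hold by the time the replace runs.

```python
import json
import re

# Shortened copy of my raw nginx line (the \x22 sequences are literal text).
raw = r'{"status": "200","headers": {\x22Host\x22:\x22localhost\x22,\x22Accept\x22:\x22*/*\x22},"http_ssl_ja3_hash": "ba730f97dcd1122e74e65411e68f1b40"}'

# First gsub: the pattern spans the whole string, and the replacement "\1"
# keeps only capture group 1, i.e. the "headers": {...} chunk.
part2 = re.sub(r'\A.*("headers":\s*\{[^}]*\}).*\Z', r'\1', raw)

# Second gsub: within that chunk only, turn each literal \x22 into a quote.
part2 = part2.replace(r'\x22', '"')

# Hypothetical values for the other two fields (assumed, not from the real config):
# everything before the headers chunk, and everything after it.
part1 = '{"status": "200"'
part3 = ',"http_ssl_ja3_hash": "ba730f97dcd1122e74e65411e68f1b40"}'

# The replace: stitch the three fields back into one message string,
# using the same "part1, part2 part3" layout as the mutate.
message = f'{part1}, {part2} {part3}'

# If the pieces line up, the merged message parses as ordinary JSON.
event = json.loads(message)
print(event["headers"]["Host"])
```

If that is right, then the merged message can only parse when part1 and part3 really contain just the text before and after the headers block, so a gsub that fails to match anywhere would leave the original \x22 text in the result.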