Hi Logstash Experts -
This is my first time dealing with Logstash, so I'm not quite sure why the logs are being formatted like this, whether this is expected/normal, or how to handle them.
I'm hitting an issue with logs flowing from Artifactory -> Fluent Bit -> Logstash. The main problem is that the documents Logstash sends are too large for the Azure Sentinel output plugin - I'm getting this error:
[ERROR][logstash.outputs.microsoftsentineloutput][artifactory][...] Received document above the max allowed size - dropping the document [document size: 3920975, max allowed size: 1036576]
Is there some way to split these batched logs into smaller documents? Is this possible within Logstash, or does it need to happen before the logs are sent from Fluent Bit?
My config is very simple:
input {
  # Logs arrive from Fluent Bit over TCP
  tcp {
    port => 8084
  }
}
output {
  # Archive a copy to S3
  s3 {
    region => "ap-northeast-1"
    bucket => "MY_BUCKET"
    prefix => "artifactory-logs/%{+YYYY}/%{+MM}/%{+dd}"
    time_file => 5
    additional_settings => {
      "force_path_style" => true
      "follow_redirects" => false
    }
    codec => json
  }
  # Local file copy for debugging
  file {
    path => "/var/log/artifactory_test.log"
    write_behavior => "overwrite"
  }
  # Forward to Azure Sentinel via the DCR-based output plugin
  microsoft-sentinel-logstash-output-plugin {
    client_app_Id => MY_ID
    client_app_secret => MY_SECRET
    tenant_id => MY_TENANT_ID
    data_collection_endpoint => MY_ENDPOINT
    dcr_immutable_id => MY_DCR
    dcr_stream_name => MY_STREAM
  }
}
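Would adding a filter block along these lines be the right way to split those batches up? This is just a rough sketch based on my reading of the json and split filter docs, and it assumes the whole JSON array from Fluent Bit lands as a string in the message field (samples of the actual data are at the end of this post):
filter {
  # Parse the raw JSON array in the message field into a temporary field
  json {
    source => "message"
    target => "entries"
  }
  # Emit one event per element of the array
  split {
    field => "entries"
  }
}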
Each log file contains a huge number of entries, batched together in an odd way: each batch is wrapped in square brackets with no delimiter between batches, and holds an arbitrary number of log entries as comma-separated objects in curly braces. All quotation marks are escaped.
[{"message":"dummy data","purpose":"testing","log_server":"splunk"},{"message":"dummy data","purpose":"testing","log_server":"splunk"},{"message":"dummy data","purpose":"testing","log_server":"splunk"},{"message":"dummy data","purpose":"testing","log_server":"splunk"}]
[{"message":"dummy data","purpose":"testing","log_server":"splunk"}]