I am having trouble getting my data ingested by Logstash (I think), but I am still learning how to do this.
Someone else stood up Cloud Custodian and is dumping its logs to an S3 bucket, and I am trying to ingest them. The files have a .log extension but contain multi-line (pretty-printed) JSON rather than one object per line. I have read that this formatting makes things harder, but I am also told that Cloud Custodian cannot be configured to write single-line files. It is what it is. As an experiment, I collapsed the log file to a single line, and that did not help, so I must be doing something else wrong as well.
An example input file:
{
  "policy": {
    "name": "account-cloudtrail-enabled",
    "resource": "account",
    "description": "Checks to make sure CloudTrail is enabled on the account\nfor all regions.\n",
    "filters": [
      {
        "type": "check-cloudtrail",
        "global-events": false,
        "multi-region": false,
        "running": false,
        "file-digest": false
      }
    ]
  },
  "version": "0.9.13",
  "execution": {
    "id": "1ebc9860-6d1a-4e42-b809-0fad544479fe",
    "start": 1638815388.1077602,
    "end_time": 1638815388.935413,
    "duration": 0.8276526927947998
  },
  "config": {
    "region": "us-east-2",
    "regions": [
      "us-east-2"
    ],
    "cache": "~/.cache/cloud-custodian.cache",
    "profile": "CCAdmin",
    "account_id": "353563186465",
    "assume_role": null,
    "external_id": null,
    "log_group": null,
    "tracer": null,
    "metrics_enabled": null,
    "metrics": null,
    "output_dir": "s3://testcclog/custodian/",
    "cache_period": 15,
    "dryrun": false,
    "authorization_file": null,
    "subparser": "run",
    "config": null,
    "configs": [
      "./policies/root_account-compliance.yml"
    ],
    "policy_filters": [],
    "resource_types": [],
    "verbose": null,
    "quiet": null,
    "debug": false,
    "skip_validation": false,
    "command": "c7n.commands.run",
    "vars": null
  },
  "sys-stats": {},
  "api-stats": {
    "iam.ListAccountAliases": 1,
    "cloudtrail.DescribeTrails": 1
  },
  "metrics": [
    {
      "MetricName": "ResourceCount",
      "Timestamp": "2021-12-06T11:29:48.934903",
      "Value": 0,
      "Unit": "Count"
    },
    {
      "MetricName": "ResourceTime",
      "Timestamp": "2021-12-06T11:29:48.934920",
      "Value": 0.8265008926391602,
      "Unit": "Seconds"
    }
  ]
}
My conf file (altered to read just the one file for testing; the output section is omitted here):
input {
  file {
    start_position => "beginning"
    path => "/etc/logstash/sample/cctest1.log"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
    target => "cc-data"
  }
  mutate {
    remove_field => ["@timestamp", "@version", "host"]
  }
}
The result is "_jsonparsefailure":
"_source" : {
"path" : "/etc/logstash/sample/cctest1.log",
"tags" : [
"_jsonparsefailure"
],
"message" : " \"policy\": {"
and the logs show (in part):
[2021-12-14T17:39:47,760][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2021-12-14T17:39:47,804][DEBUG][filewatch.sincedbcollection][main][2590825d4319bd0d59fd3a3624e3664e1d556c1fcec631e70e1e6d0b3e6a891c] open: reading from /dev/null
[2021-12-14T17:39:48,004][DEBUG][filewatch.tailmode.handlers.grow][main][2590825d4319bd0d59fd3a3624e3664e1d556c1fcec631e70e1e6d0b3e6a891c] controlled_read get chunk
[2021-12-14T17:39:48,019][DEBUG][logstash.inputs.file ][main][2590825d4319bd0d59fd3a3624e3664e1d556c1fcec631e70e1e6d0b3e6a891c] Received line {:path=>"/etc/logstash/sample/cctest1.log", :text=>"{"}
[2021-12-14T17:39:48,050][DEBUG][logstash.codecs.plain ][main][2590825d4319bd0d59fd3a3624e3664e1d556c1fcec631e70e1e6d0b3e6a891c] config LogStash::Codecs::Plain/@id = "plain_25008cc7-f861-40bc-8915-c838d4b5579a"
and
[2021-12-14T17:39:48,333][WARN ][logstash.filters.json ][main][abcf413f55725bbbfa9f9906d732319ba13915c121bd7566c4ac806507f71b3e] Error parsing json {:source=>"message", :raw=>" \"name\": \"account-cloudtrail-enabled\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: (byte[])" "name": "account-cloudtrail-enabled","; line: 1, column: 12]>}
[2021-12-14T17:39:48,335][WARN ][logstash.filters.json ][main][abcf413f55725bbbfa9f9906d732319ba13915c121bd7566c4ac806507f71b3e] Error parsing json {:source=>"message", :raw=>" \"resource\": \"account\",", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: (byte[])" "resource": "account","; line: 1, column: 16]>}
[2021-12-14T17:39:48,337][DEBUG][logstash.filters.json ][main][abcf413f55725bbbfa9f9906d732319ba13915c121bd7566c4ac806507f71b3e] Running json filter {:event=>{"path"=>"/etc/logstash/sample/cctest1.log", "@version"=>"1", "@timestamp"=>2021-12-14T17:39:48.118Z, "message"=>" \"description\": \"Checks to make sure CloudTrail is enabled on the account\\nfor all regions.\\n\",", "host"=>"ip-172-31-29-221.us-east-2.compute.internal"}}
Why would it be complaining about a ":"? That is valid JSON, no?
Can I configure this to read the whole multi-line JSON object?
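One approach I have seen suggested, but have not verified against this file, is to put a multiline codec on the file input so that all lines of one object are joined into a single event before the json filter runs. The debug log above shows the input handing over one line at a time (for example :text=>"{"), which would explain why the filter only ever sees fragments. The sketch below assumes that every top-level object starts with "{" in the first column and that all nested lines are indented, as in the pretty-printed file; the 2 second flush interval is an arbitrary value chosen for illustration.

```
input {
  file {
    path => "/etc/logstash/sample/cctest1.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # Assumption: only the opening brace of a record sits in column 0.
      # Any line that does NOT start with "{" is appended to the previous line.
      pattern => "^\{"
      negate => true
      what => "previous"
      # Emit the buffered record after 2 seconds without new lines, so the
      # last object in the file is not held back waiting for another "{".
      auto_flush_interval => 2
    }
  }
}
```

With this input the existing json filter (source => "message") would receive the whole object, newlines included, in one message. Very large records may also need the codec's max_lines or max_bytes limits raised.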
If I comment out the json filter, 'message' holds each line of the log file as plain text rather than parsed JSON:
"_source" : {
"path" : "/etc/logstash/sample/cctest1.log",
"message" : " \"policy\": {"