For the past couple of days I have been trying to work out how to stream an AWS CloudWatch Logs subscription into my Elasticsearch Service (ES) domain in AWS. My subscription passes into a Lambda function (written in Node.js) that POSTs to the domain endpoint.
I am pulling data from many different sources: EC2, S3, RDS, VPC, IAM, etc.
The issue I am running into has to do with field mappings. I am trying to use a single index that is rotated weekly (e.g. cwl-2020-week36), as opposed to separate indices per source, which has led to issues before.
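For illustration, a minimal sketch of how a weekly index name like the one above could be derived inside a Node.js Lambda. It assumes ISO 8601 week numbering computed in UTC; the actual numbering scheme is an assumption, and `weeklyIndexName` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds an index name such as "cwl-2020-week36".
// Assumption: ISO 8601 weeks (Monday start, week 1 contains the first Thursday), in UTC.
function weeklyIndexName(date, prefix = 'cwl') {
  // Copy the date at UTC midnight so the caller's Date is not mutated.
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  // Map Sunday (0) to 7 so that Monday = 1 ... Sunday = 7.
  const day = d.getUTCDay() || 7;
  // Move to the Thursday of the same ISO week; its year is the ISO week-year.
  d.setUTCDate(d.getUTCDate() + 4 - day);
  const yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  const week = Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
  return `${prefix}-${d.getUTCFullYear()}-week${String(week).padStart(2, '0')}`;
}
```

Because the year is taken from the Thursday of the week, dates in the first days of January can land in the last week of the previous year.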
A field of the same name may appear as an integer in one document, but then a string in another.
Or a field might be a single value in one, and an array in another.
My question is: how should I resolve fields that have different value types across the different CloudWatch Logs I have streaming in?
In my Lambda, I have tried parsing fields as strings (an exhaustive approach that still leaves stragglers). I have also tried designing a dynamic index template that I had hoped would mediate the conflicts between indices that were previously separated by event source.
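To make the two attempts concrete, here is a rough sketch of what they could look like. It is not the exact code in use: `stringifyLeaves` and `buildCwlTemplate` are hypothetical names, and the template body assumes an Elasticsearch 7.x domain with typeless mappings (a 6.x domain would need a mapping type name wrapped around `dynamic_templates`).

```javascript
// Hypothetical helper: recursively convert every scalar leaf to a string so that
// a field seen as 200 in one source and "200" in another gets one consistent type.
function stringifyLeaves(value) {
  if (value === null || value === undefined) return null;
  if (Array.isArray(value)) return value.map(stringifyLeaves);
  if (typeof value === 'object') {
    const out = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = stringifyLeaves(child);
    }
    return out;
  }
  return String(value); // numbers, booleans, strings
}

// Hypothetical index template body: map every dynamically detected scalar type
// to "keyword" for indices matching the pattern.
// Assumption: ES 7.x typeless mappings.
function buildCwlTemplate(pattern = 'cwl-*') {
  const scalarTypes = ['long', 'double', 'boolean', 'date', 'string'];
  return {
    index_patterns: [pattern],
    mappings: {
      dynamic_templates: scalarTypes.map((type) => ({
        [`${type}_as_keyword`]: {
          match_mapping_type: type,
          mapping: { type: 'keyword' },
        },
      })),
    },
  };
}
```

Neither sketch covers a field that is an object in one source and a scalar in another; that kind of conflict is one way stragglers can remain.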
I know transformations and filtering can be done with grok and Logstash, but I am uncertain how to use Logstash with the Elasticsearch Service in AWS (as opposed to hosting ELK on an EC2 instance).
Please let me know of any ideas you may have, or if more information is needed.
Thanks all,