Elastic Serverless Forwarder for AWS adding reserved _id field when sending to logstash

stabbotco1 · August 3, 2023, 9:50pm

Hi All!
I am new to ES, so apologies in advance if I misstate some things.

We are looking to use the Elastic Serverless Forwarder (ESF) for AWS to send data to Logstash before sending it on to our self-hosted ES cluster.

The initial simple setup uses an emitting Lambda that writes a JSON log event to its CloudWatch (CW) logs every minute. The forwarder is subscribed to those CW logs, so it is triggered each time an event is emitted.

sample event:

{
    "timestamp": "2023-08-03T21:39:07.179193",
    "random_field": "constant string value here",
    "aws_request_id": "98f1d8fb-3361-474f-b525-10c6513edb36"
}

There are no filters. The forwarder is triggered and sends the event to Logstash, which returns a 200.

The Logstash logs, however, show a 400: the event cannot be indexed because it contains the reserved field _id.

error log snippet:
{"type":"mapper_parsing_exception","reason":"failed to parse field [_id] of type [_id] in document with id 'jJdVvIkBlVXM_z6z4Rmr'. Preview of field's value: '1691081882600-77a39f7173d6b4b6455fd7ae9f2fb147afd919365019f91f9bce160b1db21b100f45cc91f0c04489020b2725e53bacad-000000000000'","caused_by":{"type":"mapper_parsing_exception","reason":"Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters."}}}}}.

We are trying this with Elasticsearch and Logstash version 7.16.2, which should be supported per the docs.

I understand _id is reserved, but I am surprised the forwarder is sending it as part of the document to be indexed, especially since the _id field is not part of the metadata of the original event.

I have seen suggestions on filtering out the field from the AWS CW subscription, as well as removing or renaming the field in logstash, but I am surprised this would be needed at all since the field is not present in the source, and I would not expect the ES addon to include this field in a minimal implementation with no mapping.

Any thoughts appreciated, and thank you all for reading!

stephenb · August 3, 2023, 10:32pm

@jsoriano any thoughts?

@stabbotco1

Can you share your logstash configuration please?

leandrojmp · August 4, 2023, 3:19am

I do not use the ESF, but after looking into it, the _id field is indeed created by the forwarder, as this GitHub issue makes clear.

From what I understood, it is used to avoid duplicates in Elasticsearch.

Documentation on what the Logstash pipeline for the ESF should look like is non-existent, but I think that if you add the following line to the elasticsearch output in your Logstash configuration, it should work:

document_id => "%{_id}"

This tells the elasticsearch output to use the value of the _id field as the id of the document, and it should avoid the mapping error that you are getting.
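
For reference, here is a minimal sketch that combines the document_id setting above with the renaming approach mentioned earlier in the thread. It moves _id into [@metadata] so the value can still be used for deduplication while the reserved field no longer appears inside the document body. The host and index name are placeholders, and the pipeline is an untested assumption about how this could be wired up, not an official ESF configuration.

```
filter {
  # Only touch events that actually carry the forwarder-generated _id.
  if [_id] {
    mutate {
      # Fields under [@metadata] are not sent to outputs as part of the event.
      rename => { "_id" => "[@metadata][_id]" }
    }
  }
}

output {
  elasticsearch {
    # Placeholder host; replace with your cluster address.
    hosts => ["http://localhost:9200"]
    # Placeholder index name.
    index => "esf-logs-%{+YYYY.MM.dd}"
    # Use the forwarder's id as the Elasticsearch document id.
    document_id => "%{[@metadata][_id]}"
  }
}
```

Note that if an event arrives without an _id, the document_id reference is not substituted and stays a literal string, so events from other sources would need a separate output or a conditional around this one.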

stabbotco1 · August 4, 2023, 12:55pm

Thank you! All the information was spot on and very helpful!

system · September 1, 2023, 12:55pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
