Elasticsearch 7.9.3 - data insertion exceptions no longer logged

Hello,
I recently upgraded our cluster from 7.5.2 to 7.9.3, and I noticed that some exceptions which used to appear in the Elasticsearch log before the upgrade are now gone.
Some examples:

[2020-11-22T11:34:05,212][DEBUG][o.e.a.b.TransportShardBulkAction] [aws-elkdb17] [staging-gwstats-silent-2020.11.22][0] failed to execute bulk item (index) index {[index-a-2020.11.22][_doc][0zW773UB1LS_CBEphClo], source[n/a, actual length: [4.4kb], max length: 2kb]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [ips_last_modified_release] of type [long] in document with id '0zW773UB1LS_CBEphClo'. Preview of field's value: 'MAPP2008'

[2020-11-22T03:21:09,005][DEBUG][o.e.a.b.TransportShardBulkAction] [aws-elkdb17] [mail-sec-intelligence-2020.11][1] failed to execute bulk item (index) index {[index-b-2020.11][_doc][h8347XUB1LS_CBEpOMeH], source[n/a, actual length: [76kb], max length: 2kb]}
java.lang.IllegalArgumentException: Document contains at least one immense term in field="mail.links.image_source" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[100, 97, 116, 97, 58, 105, 109, 97, 103, 101, 47, 106, 112, 101, 103, 59, 98, 97, 115, 101, 54, 52, 44, 47, 57, 106, 47, 52, 65, 65]...', original message: bytes can be at most 32766 in length; got 56087

Is there a reason these errors are no longer visible in the log file?
Is this configurable, so that they can be brought back?
This is important for us, as we used to monitor these errors and inform the data owners that they have issues with their data flow.

Thanks in advance,
Lior

It looks like you are using time-based indices, so it's likely the new index has had a mapping change and adapted to the data.

Can you compare the mappings from that index and a newer one?
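
For example, a minimal sketch along these lines (the host and index names are placeholders, not your real ones) would pull both mappings and print the differences:

```python
# Sketch: fetch and diff the mappings of an older and a newer daily index.
import json
import difflib
import requests

ES = "http://localhost:9200"        # placeholder host
OLD_INDEX = "index-a-2020.11.21"    # placeholder: an index from before the change
NEW_INDEX = "index-a-2020.11.22"    # placeholder: a newer daily index

def get_mapping(index):
    """Fetch one index's mapping and pretty-print it so it diffs cleanly."""
    resp = requests.get(f"{ES}/{index}/_mapping")
    resp.raise_for_status()
    return json.dumps(resp.json()[index]["mappings"], indent=2, sort_keys=True)

old = get_mapping(OLD_INDEX).splitlines()
new = get_mapping(NEW_INDEX).splitlines()
print("\n".join(difflib.unified_diff(old, new,
                                     fromfile=OLD_INDEX, tofile=NEW_INDEX,
                                     lineterm="")))
```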

Hey @warkolm,
what you describe could be an explanation, but I validated that this is not the case: I do see errors on the Logstash side which, until version 7.9, were correlated with matching exceptions on the data nodes.
Here are examples of recent exceptions I see on the Logstash side:

So I went to check the Elasticsearch logs on the nodes which hold these indices, but couldn't find the matching MapperParsingException.

Thanks,
Lior

Hey @warkolm,

Did you manage to see my previous message?
Thanks,
Lior

Usually, if there are no errors in the logs, then there is nothing causing any error.

Hey @warkolm,
Usually I would agree, but the thing is that before the version upgrade, the exceptions on the Logstash side were correlated with errors on the Elasticsearch nodes.

For example, right now there are these errors in Logstash:
[2020-12-10 11:19:34] [sba2_telemetry_da-2020-50] [mapper_parsing_exception] failed to parse field [HardwareInfo.PhysicalDisks] of type [keyword] in document with id 'ZZHRS3YBO_pGjNKuYtRC'. Preview of field's value: '{DiskType=SSD, BusType=NVMe, DiskSize=238}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:1203

And on the Elasticsearch node which holds this index there is no indication of events which failed to be indexed.

I wonder if there is a way to make these errors appear again in the Elasticsearch log, as we have a monitoring process for them which is based on the Elasticsearch log file.
Achieving the same goal with the Logstash logs is more problematic, as our Logstash processes run in AWS ECS rather than on the cluster, which is EC2 based.

Thanks,
Lior

Yes, they were DEBUG messages and shouldn't have been there by default. You can get them back in the server logs if you want, but really you shouldn't be relying on DEBUG logging for anything since this logging level is not intended for end-users and can change at any time. It was debatable whether this was really a breaking change because of that, but we decided to err on the safe side and document it as such anyway. Any exceptions logged by this logger are also returned to the client that triggered the failing action, and our recommendation is that they should be handled entirely on the client.
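
If you do decide to turn it back on despite that, a minimal sketch of raising the logger level dynamically would look something like this (the host is a placeholder; the logger name is the expanded form of the o.e.a.b.TransportShardBulkAction entries in your log excerpts):

```python
# Sketch: dynamically raise the log level for the bulk-action logger so the
# per-item failure messages appear in the server log again.
import requests

ES = "http://localhost:9200"   # placeholder host

# "persistent" keeps the override across restarts; set the value to None
# (JSON null) later to remove it again.
settings = {
    "persistent": {
        "logger.org.elasticsearch.action.bulk.TransportShardBulkAction": "DEBUG"
    }
}

resp = requests.put(f"{ES}/_cluster/settings", json=settings)
resp.raise_for_status()
print(resp.json())
```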

You need to watch the errors reported by the client anyway, since these will include cases where the request didn't make it to Elasticsearch at all, for which there is therefore no corresponding entry in the Elasticsearch log.
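
To illustrate what handling it on the client can look like, here is a rough sketch that checks the per-item errors in a bulk response (the host, index name and documents are hypothetical, modelled on your first example):

```python
# Sketch: inspect per-item failures in a bulk response instead of relying on
# server-side DEBUG logs.
import json
import requests

ES = "http://localhost:9200"   # placeholder host

docs = [
    {"ips_last_modified_release": 20201122},    # parses as a long
    {"ips_last_modified_release": "MAPP2008"},  # fails if the field is mapped as long
]

# Build the newline-delimited bulk body.
lines = []
for doc in docs:
    lines.append(json.dumps({"index": {"_index": "index-a-2020.11.22"}}))
    lines.append(json.dumps(doc))
body = "\n".join(lines) + "\n"

resp = requests.post(f"{ES}/_bulk",
                     data=body,
                     headers={"Content-Type": "application/x-ndjson"})
result = resp.json()

if result.get("errors"):
    for item in result["items"]:
        error = item["index"].get("error")
        if error:
            # This carries the same information the old DEBUG log line did.
            print(error["type"], "-", error["reason"])
```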

Hey @DavidTurner,
thank you for the detailed answer.
By the client side, in this case you mean Logstash, right?

I will see whether I can create some Grafana dashboards based on the CloudWatch logs in order to monitor this.

Thanks,
Lior

Yes, sorry, I didn't see that your later post indicated Logstash was the client.
