Can Index Parsing Errors Impact node/cluster stability?

Russell_Day · June 15, 2018, 6:47am

Please refer to:

danielmitterdorfer · June 15, 2018, 7:19am

Hi,

Hard to tell. I would not expect increased heap usage because of that. However, if you are provoking a lot of exceptions, the JIT compiler will optimize that part of the code differently, which will likely lead to worse performance.

I don't know anything about your application, but it is possible to get the current mapping of an index via the get mapping API (GET /<index>/_mapping).
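As a rough illustration of using the get mapping API from an ingest service, the sketch below fetches a mapping and flattens it into field names and types, so a document can be compared against it before it is sent. The helper names and the base URL are assumptions for illustration and are not from this thread; only the `GET /<index>/_mapping` endpoint itself is part of Elasticsearch.

```python
import json
import urllib.request


def fetch_mapping(base_url, index):
    """GET /<index>/_mapping and return the parsed JSON body.

    base_url (for example "http://localhost:9200") is an assumption;
    use whatever address your cluster is reachable on.
    """
    with urllib.request.urlopen(f"{base_url}/{index}/_mapping") as resp:
        return json.load(resp)


def flatten_properties(properties, prefix=""):
    """Turn a mapping "properties" object into {dotted.field.name: type}."""
    fields = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object or nested field: recurse into its sub-fields.
            fields.update(flatten_properties(spec["properties"], path + "."))
        else:
            fields[path] = spec.get("type", "object")
    return fields
```

Where the `properties` object sits in the response depends on the version: on 6.x it is under `mappings.<type>.properties`, on 7.x and later directly under `mappings.properties`. Pass that object to `flatten_properties`.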

Daniel

Russell_Day · June 15, 2018, 1:44pm

Thanks Daniel,

It seems unlikely to me as well; however, I just wanted to check. A common issue we are seeing is log messages like this: "timed out waiting for all nodes to process published state".

What happens when this is the case? Does the master try again or does it potentially remove nodes that did not ack the latest version from the cluster?

danielmitterdorfer · June 15, 2018, 2:46pm

Hi,

If publishing fails, you will see a warning in the log, but there is no retry at that level. However, the node(s) that failed to acknowledge publication within the timeout will receive the full cluster state when the next cluster state update is published.

Daniel

Russell_Day · June 15, 2018, 3:05pm

So the node(s) that did not ack the cluster state update within the timeout are still in service and can serve requests, correct?

Russell_Day · June 15, 2018, 6:34pm

Also, I would like to clarify that our ingest service is the one throwing the HighLevelRestClient errors. The data node logs show:
org.elasticsearch.index.mapper.MapperParsingException: failed to parse...

Can you clarify if a high number of these messages can lead to node health issues?
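To get a feel for how many of these failures occur, one option is to tally them on the client side. The sketch below assumes the ingest service indexes through the bulk API (the thread does not say so) and has the bulk response available as a parsed JSON dict; `count_bulk_errors` is a hypothetical helper name.

```python
from collections import Counter


def count_bulk_errors(bulk_response):
    """Count failed items in a bulk response, grouped by error type."""
    counts = Counter()
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action
        # (index, create, update or delete).
        for result in item.values():
            error = result.get("error")
            if isinstance(error, dict):
                counts[error.get("type", "unknown")] += 1
    return counts
```

A count of `mapper_parsing_exception` entries that stays high over time suggests that the documents being sent regularly disagree with the index mapping.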

danielmitterdorfer · June 18, 2018, 11:50am

Hi,

Yes, usually the nodes are still in service (but it could always happen that a node just died within this time period).

However, the fact that a cluster state update takes more than half a minute indicates that these nodes are under stress (perhaps many cluster state updates due to very frequent changes in the mapping?). I think it would make sense to dig deeper into why that is the case in your cluster.

Some pointers:

What is happening at the hardware level (disk, memory, CPU, network and other resources)?
What are the affected nodes doing at that point? (Use the hot_threads API, attach a profiler if on a test system, take thread stack traces, etc.)
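For the hot_threads pointer, a minimal sketch of calling the nodes hot threads API over HTTP might look like the following. The function names and the base URL are assumptions for illustration; the endpoint and its `threads` and `interval` parameters are part of Elasticsearch, and the response is plain text rather than JSON.

```python
import urllib.parse
import urllib.request


def hot_threads_url(base_url, node_id=None, threads=3, interval="500ms"):
    """Build the URL for GET /_nodes/hot_threads, optionally for one node."""
    if node_id is None:
        path = "/_nodes/hot_threads"
    else:
        path = f"/_nodes/{node_id}/hot_threads"
    query = urllib.parse.urlencode({"threads": threads, "interval": interval})
    return f"{base_url}{path}?{query}"


def fetch_hot_threads(base_url, node_id=None):
    """Return the plain-text hot threads report."""
    with urllib.request.urlopen(hot_threads_url(base_url, node_id)) as resp:
        return resp.read().decode("utf-8")
```

Capturing this output while a "timed out waiting for all nodes" warning is being logged shows what the slow nodes were busy with at that moment.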

Daniel

system · July 16, 2018, 11:50am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
