Elasticsearch stopped allocating shards due to missing kuromoji_part_of_speech filter

Hi,
We have a web index containing Japanese documents, which uses analyzers from the analysis-icu and analysis-kuromoji plugins. Our Elasticsearch cluster is deployed via ECK, and the plugins are installed by an init container.

This has worked fine so far. Recently, however, the nodes suddenly stopped allocating the shards with this error:

shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-11-10T08:07:20.393Z], failed_attempts[5], failed_nodes[[DZkSjlD3SSyHPmrFEL1SzQ, pK3-4-S5TtStIwG1YArLag, 2_EeQbAlS0WoANaZT7H9oQ]], delayed=false, details[failed shard on node [pK3-4-S5TtStIwG1YArLag]: failed to create index, failure IllegalArgumentException[Unknown filter type [kuromoji_part_of_speech] for [ja_pos_filter]]], allocation_status[no_attempt]]]

ja_pos_filter looks like this:

"ja_pos_filter" : {
  "type" : "kuromoji_part_of_speech",
  "stoptags" : [
    "\\u52a9\\u8a5e-\\u683c\\u52a9\\u8a5e-\\u4e00\\u822c",
    "\\u52a9\\u8a5e-\\u7d42\\u52a9\\u8a5e"
  ]
}

I checked that the analysis-kuromoji plugin is correctly installed; it shows up in _cat/plugins. The Elasticsearch version is 7.7.1. I tried upgrading one of the affected nodes to 7.9.3, but it still has the same issue. We do not cache the downloaded plugins, so I wonder whether a more recent version of the plugin changed something that would cause this incompatibility. There are no errors in the node's runtime log.

At this point, I am afraid of restarting more nodes, since that would eventually result in the whole index being unavailable. Any idea what the problem might be?

I checked the contents of the 7.9.3 kuromoji plugin, and the AnalysisKuromojiPlugin class does add KuromojiPartOfSpeechFilterFactory to the token filter map under the key kuromoji_part_of_speech. I am at a total loss as to why it isn't working.

The issue turned out to be quite trivial. The nodes reporting the error did indeed have the needed plugins, but a handful of other nodes had failed to download them properly and started anyway. Whether a shard allocation succeeded or failed was essentially random, and the node reporting the error was never the node that was actually missing the plugin. After ensuring that ALL nodes had the needed plugins without exception, the issue resolved itself immediately.
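Since _cat/plugins lists plugins per node, one way to catch this kind of gap early is to diff each node's plugin list against the set of nodes. A minimal sketch, assuming output in the shape of GET _cat/plugins?h=name,component (the node names and the missing entry below are made up for illustration):

```shell
#!/usr/bin/env bash
# Sample of what `GET _cat/plugins?h=name,component` might return;
# in a real check you would fetch this with curl from the cluster.
cat_plugins='node-0 analysis-icu
node-0 analysis-kuromoji
node-1 analysis-icu
node-2 analysis-icu
node-2 analysis-kuromoji'

# Print every node that does NOT report analysis-kuromoji:
echo "$cat_plugins" | awk '
  { nodes[$1]; if ($2 == "analysis-kuromoji") has[$1] }
  END { for (n in nodes) if (!(n in has)) print n }'
# prints: node-1
```

Running this against live _cat/plugins output would have pointed straight at the nodes that started without the plugin.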

Adding set -e at the start of the init container's install script should ensure this doesn't happen again, since a failed plugin download will then fail the init container instead of letting the node start without the plugin.
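To illustrate why set -e matters here, a small self-contained sketch (the failing function is just a stand-in for a failed plugin download, not the real install script):

```shell
#!/usr/bin/env bash
# Body shared by both runs: a download that fails, then the step that
# should only run if everything before it succeeded.
script='
failing_download() { return 1; }   # stand-in for a failed plugin download
failing_download
echo "node started anyway"
'

# Without set -e the failure is silently ignored and the script keeps going:
bash -c "$script"                                     # prints: node started anyway

# With set -e the script aborts at the first failing command:
bash -c "set -e; $script" || echo "install aborted"   # prints: install aborted
```

With set -e in place, the init container exits non-zero on a bad download, so Kubernetes never starts the Elasticsearch container on that node.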
