Custom Analyzer [my_analyzer] failed to find tokenizer under name [whitespace]

Hello Team,

I'm using Elasticsearch 6.7.2. While testing with the Elasticsearch test framework, I get the following error:

java.lang.IllegalArgumentException: Custom Analyzer [my_analyzer] failed to find tokenizer under name [whitespace]
    at __randomizedtesting.SeedInfo.seed([15944A0400B02A8:3E5900C2E7F115DD]:0)
    at org.elasticsearch.index.analysis.CustomAnalyzerProvider.build(CustomAnalyzerProvider.java:58)
    at org.elasticsearch.index.analysis.AnalysisRegistry.processAnalyzerFactory(AnalysisRegistry.java:553)
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:477)
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:167)
    at org.elasticsearch.index.IndexService.<init>(IndexService.java:164)
    at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:402)
    at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:526)
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:480)

Kindly suggest.

Can you share the request causing this?

Thanks for your quick response.

I'm currently using Elasticsearch 6.7.2, with the following setup in the test class:

@Override
@Before
public void setUp() throws Exception {
    super.setUp();

    final Settings.Builder settings = Settings.builder();
    settings.put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 0).put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)
            .put("index.similarity.simpay.type", SimPayPlugin.NAME)
            .put("analysis.analyzer.my_analyzer.type", "custom")
            .put("analysis.analyzer.my_analyzer.tokenizer", "whitespace")
            .putList("analysis.analyzer.my_analyzer.filter", "lowercase", MyAnalyzerTokenFilterPlugin.NAME);

    prepareCreate(INDEX).setSettings(settings)
            .addMapping(INDEX_TYPE, jsonBuilder().startObject().startObject("properties")
                    //
                    .startObject("content").field("type", "text").field("similarity", "simpay")
                    .field("analyzer", "my_analyzer").endObject()
                    //
                    .endObject().endObject())
            .get();
    ensureGreen();

    client().prepareIndex(INDEX, INDEX_TYPE, "1")
            .setSource("content", "foo:2.3 bar:2.5 baz foo:3.7", "material", "Wafers:7.3 ICs:0.02 Chips:5.8").get();

    refresh();
}

When I run the test, it fails with the error shown above.
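
For comparison with a request sent by hand (for example from Kibana), the settings and mapping from the setup above can be written out as a create-index body. This is only a sketch using the plain JDK, not the Elasticsearch client. The index name `test` is taken from the log output in this thread; `<SimPayPlugin.NAME>`, `<MyAnalyzerTokenFilterPlugin.NAME>` and `<INDEX_TYPE>` are placeholders, because the real values of those constants are not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CreateIndexRequestSketch {

    // Placeholders (assumptions): the actual constant values are not given in the thread.
    static final String SIMILARITY_TYPE = "<SimPayPlugin.NAME>";
    static final String CUSTOM_FILTER = "<MyAnalyzerTokenFilterPlugin.NAME>";
    static final String INDEX_TYPE = "<INDEX_TYPE>";

    // Small helper: builds an insertion-ordered map from key/value pairs.
    static Map<String, Object> obj(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    // Minimal renderer for maps, lists, numbers and strings.
    // It does no escaping, so it is not a general-purpose JSON serializer.
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(CreateIndexRequestSketch::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof Number) {
            return value.toString();
        }
        return "\"" + value + "\"";
    }

    // Nested form of the flat settings keys and the mapping used in setUp().
    static String body() {
        Map<String, Object> analyzer = obj(
                "type", "custom",
                "tokenizer", "whitespace",
                "filter", Arrays.asList("lowercase", CUSTOM_FILTER));
        Map<String, Object> index = obj(
                "number_of_shards", 1,
                "number_of_replicas", 0,
                "similarity", obj("simpay", obj("type", SIMILARITY_TYPE)),
                "analysis", obj("analyzer", obj("my_analyzer", analyzer)));
        Map<String, Object> content = obj(
                "type", "text",
                "similarity", "simpay",
                "analyzer", "my_analyzer");
        Map<String, Object> mappings = obj(INDEX_TYPE, obj("properties", obj("content", content)));
        return toJson(obj("settings", obj("index", index), "mappings", mappings));
    }

    public static void main(String[] args) {
        System.out.println("PUT /test");
        System.out.println(body());
    }
}
```

The placeholders have to be replaced with the real names before the body can be sent to a cluster.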

@spinscale, any clue for me?
Thanks in advance.

Can you share the HTTP request that is about to be executed? If needed, you can increase the logging of the HTTP client so that the data gets dumped. See https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.5/java-rest-low-usage-logging.html

Hello @spinscale,

The test failed while creating the test index. No HTTP request was printed in the log file.

I suspect that the 6.7.2 test framework jar does not register the "whitespace" tokenizer.
The same request runs properly via Kibana against a 6.7.2 Elasticsearch cluster.

Additionally, this test worked on Elasticsearch 6.2.2. It stopped working when I upgraded the Elasticsearch version.

[2020-02-05T11:53:29,400][TRACE][o.e.t.TaskManager        ] [numberOfShards] register 34 [transport] [indices:admin/create] []
[2020-02-05T11:53:29,400][TRACE][o.e.c.s.MasterService    ] [node_s0] will process [create-index [test], cause [api]]
[2020-02-05T11:53:29,400][DEBUG][o.e.c.s.MasterService    ] [node_s0] processing [create-index [test], cause [api]]: execute
[2020-02-05T11:53:29,402][TRACE][o.e.i.IndexSettings      ] [node_s0] [test] using [tiered] merge mergePolicy with expunge_deletes_allowed[10.0], floor_segment[2mb], max_merge_at_once[10], max_merge_at_once_explicit[30], max_merged_segment[5gb], segments_per_tier[10.0], deletes_pct_allowed[33.0]
[2020-02-05T11:53:29,403][DEBUG][o.e.i.IndicesService     ] [node_s0] creating Index [[test/zrkt0RkjRIGyy6gTnGhQhw]], shards [1]/[0] - reason [create index]
[2020-02-05T11:53:29,403][DEBUG][o.e.i.c.q.DisabledQueryCache] [node_s0] [test] Using no query cache
[2020-02-05T11:53:29,405][TRACE][o.e.c.s.MasterService    ] [node_s0] failed to execute cluster state update in [5ms], state:
version [6], source [create-index [test], cause [api]]
nodes: 
   {node_s0}{Qdn2klWERWuMGnfFpiQYVg}{4sIWVhVuRk2SB2FtIhjAhg}{127.0.0.1}{127.0.0.1:63676}, local, master
   {node_s1}{dV9AksWfQqew_k852LizaQ}{U5nUx8VJTZmUxzBIIO7qCg}{127.0.0.1}{127.0.0.1:63675}
routing_table (version 1):
routing_nodes:
-----node_id[dV9AksWfQqew_k852LizaQ][V]
-----node_id[Qdn2klWERWuMGnfFpiQYVg][V]
---- unassigned

java.lang.IllegalArgumentException: Custom Analyzer [my_analyzer] failed to find tokenizer under name [whitespace]
    at org.elasticsearch.index.analysis.CustomAnalyzerProvider.build(CustomAnalyzerProvider.java:58) ~[elasticsearch-6.7.2.jar:6.7.2]
    at org.elasticsearch.index.analysis.AnalysisRegistry.processAnalyzerFactory(AnalysisRegistry.java:553) ~[elasticsearch-6.7.2.jar:6.7.2]
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:477) ~[elasticsearch-6.7.2.jar:6.7.2]
    at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:167) ~[elasticsearch-6.7.2.jar:6.7.2]
    at org.elasticsearch.index.IndexService.<init>(IndexService.java:164) ~[elasticsearch-6.7.2.jar:6.7.2]
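
The suspicion that the test nodes do not register `whitespace` can be pictured with a toy model. The sketch below is not Elasticsearch code: it uses plain JDK maps to show how a name lookup fails when the component that contributes the name is not loaded. It rests on an assumption that is not confirmed anywhere in this thread, namely that in later 6.x releases the `whitespace` tokenizer is contributed by the analysis-common module (`CommonAnalysisPlugin`) rather than by the server core. The class and method names here are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenizerRegistrySketch {

    // Toy stand-in for a plugin: tokenizer name -> where it comes from.
    // Assumption: "whitespace" is contributed by the analysis-common module.
    static final Map<String, String> COMMON_ANALYSIS =
            Collections.singletonMap("whitespace", "analysis-common");

    // The registry holds the core tokenizers plus whatever the loaded plugins add.
    static Map<String, String> buildRegistry(List<Map<String, String>> plugins) {
        Map<String, String> registry = new HashMap<>();
        registry.put("standard", "core");
        for (Map<String, String> plugin : plugins) {
            registry.putAll(plugin);
        }
        return registry;
    }

    // Fails in the same way as the error in this thread when the name is unknown.
    static String resolve(Map<String, String> registry, String analyzer, String tokenizer) {
        String provider = registry.get(tokenizer);
        if (provider == null) {
            throw new IllegalArgumentException("Custom Analyzer [" + analyzer
                    + "] failed to find tokenizer under name [" + tokenizer + "]");
        }
        return provider;
    }

    public static void main(String[] args) {
        // With the plugin loaded, the lookup succeeds.
        Map<String, String> withPlugin = buildRegistry(Collections.singletonList(COMMON_ANALYSIS));
        System.out.println(resolve(withPlugin, "my_analyzer", "whitespace"));

        // Without it, only the core names are known and the lookup fails.
        Map<String, String> coreOnly = buildRegistry(Collections.emptyList());
        try {
            resolve(coreOnly, "my_analyzer", "whitespace");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In an `ESIntegTestCase`, the plugins loaded on the test nodes come from the overridable `nodePlugins()` method. If the assumption above holds, returning `CommonAnalysisPlugin.class` from that method alongside the custom plugins, with the jar that contains it on the test classpath, would be the thing to try.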

You can enable logging in the client to have the request printed out. That would be really helpful in this case, and it would also help to check the differences between the two versions.
