Error with embedded server and tokenizer: edge_ngram

Hi,

I'm doing integration testing with the RestHighLevelClient (ES v7.8).
I'm starting an embedded server with the hack described here:

When I try to create an index with these settings, I get an error.
Code:

// Load the index definition (settings + mappings) from a JSON resource and create the index.
String json = FileReader.jsonAsString(ElasticConstants.MAPPINGS, index);
CreateIndexRequest createIndex = new CreateIndexRequest(index).source(json, XContentType.JSON);
CreateIndexResponse result = client.indices().create(createIndex, RequestOptions.DEFAULT);
assertTrue(result.isAcknowledged());

Index:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "name_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": [
            "ascii_folding",
            "lowercase",
            "strip_underscore"
          ]
        }
      },
      "filter": {
        "ascii_folding": {
          "type": "asciifolding",
          "preserve_original": true
        },
        "strip_underscore": {
          "type": "word_delimiter",
          "split_on_numerics": true,
          "split_on_case_change": false,
          "generate_word_parts": true,
          "generate_number_parts": false,
          "catenate_all": true,
          "preserve_original": true
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "singleName": {
          "match": "single",
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "name_analyzer"
          }
        }
      },
      {
        "fullName": {
          "match": "full",
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "name_analyzer"
          }
        }
      }
    ],
    "properties": {
      "id": {
        "type": "text"
      },
      "code": {
        "type": "keyword",
        "store": true
      },
      "name": {
        "enabled": true,
        "type": "object"
      },
      "location": {
        "type": "geo_point"
      },
      "parents": {
        "type": "object"
      },
      "datasets": {
        "type": "object"
      },
      "alias": {
        "type": "text"
      },
      "sales": {
        "type": "object"
      },
      "type": {
        "type": "text"
      }
    }
  }
}

Error:

	ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=Unknown tokenizer type [edge_ngram] for [edge_ngram_tokenizer]]
]
	at __randomizedtesting.SeedInfo.seed([9E7C3A67BA48895C:ABE0CE1FF2AE5784]:0)
	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
	at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1897)
	at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1867)
	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1624)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1596)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1563)
	at org.elasticsearch.client.IndicesClient.create(IndicesClient.java:139)
	at com.despegar.suggester.sync.test.integration.repository.jest.AbstractESRepositoryTest.setUpIndex(AbstractESRepositoryTest.java:81)
	at com.despegar.suggester.sync.test.integration.repository.jest.CityESRepositoryTest.setup(CityESRepositoryTest.java:31)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:972)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)
	Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://127.0.0.1:53103], URI [/cities?master_timeout=30s&timeout=30s], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Unknown tokenizer type [edge_ngram] for [edge_ngram_tokenizer]"}],"type":"illegal_argument_exception","reason":"Unknown tokenizer type [edge_ngram] for [edge_ngram_tokenizer]"},"status":400}
		at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:283)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:261)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235)
		at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1611)
		... 41 more

However, when I create the same index manually against a Docker image of ES 7.8, it is created fine (a PUT with exactly the same body).

Can anyone figure out what might be happening?

Hi @Emanuel_Velzi

the problem in your test looks to be that the common analysis plugin is not loaded when your test runs. You will need to add CommonAnalysisPlugin.class to the list of plugins returned by your test's nodePlugins() so that it gets loaded on the test nodes:

    @Override
    protected Collection<Class<? extends Plugin>> nodePlugins() {
        return List.of(Netty4Plugin.class, CommonAnalysisPlugin.class);
    }

or something along those lines. Then the missing tokenizer should load fine and your mapping should work.
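
Once the plugin is on the node, a quick way to sanity-check that the tokenizer is really available is the _analyze API. A minimal sketch using the 7.x high-level client (AnalyzeRequest and AnalyzeResponse live in org.elasticsearch.client.indices; index and client are the variables from your test, and the sample text is arbitrary):

// Run the index's "name_analyzer" over some sample text; with the edge_ngram tokenizer
// loaded you should see prefix tokens such as "bu", "bue", "buen", ...
AnalyzeRequest analyzeRequest = AnalyzeRequest.withIndexAnalyzer(index, "name_analyzer", "Buenos Aires");
AnalyzeResponse analyzeResponse = client.indices().analyze(analyzeRequest, RequestOptions.DEFAULT);
analyzeResponse.getTokens().forEach(t -> System.out.println(t.getTerm()));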

You are right! I also had to add this Maven dependency:

<!-- https://mvnrepository.com/artifact/org.codelibs.elasticsearch.module/analysis-common -->
<dependency>
    <groupId>org.codelibs.elasticsearch.module</groupId>
    <artifactId>analysis-common</artifactId>
    <version>7.8.0</version>
</dependency>

It's working now, thanks!!

Can I ask why this happens? I thought these plugins were part of the core and would always be loaded.

No problem!

Sure. Even though core Elasticsearch releases bundle this and other plugins/modules, we develop them in separate build/Gradle modules for various reasons. That's why simply loading the core server module of Elasticsearch in a test does not give you the full classpath that a complete ES release ships with. It's also why we have the facility to manually specify which plugins to load in tests: it gives the test nodes predictable behaviour, somewhat independent of the current classpath.
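
For completeness, here is roughly what that looks like in context. This is only a sketch that assumes your AbstractESRepositoryTest extends ESIntegTestCase from the test framework; the package names are taken from the 7.x sources, so double-check them against your classpath:

import java.util.Collection;
import java.util.List;

import org.elasticsearch.analysis.common.CommonAnalysisPlugin; // provided by the analysis-common module
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.transport.Netty4Plugin;               // provided by the transport-netty4 module

public abstract class AbstractESRepositoryTest extends ESIntegTestCase {

    // Every plugin class listed here is installed on the nodes the test framework starts,
    // which is what makes module-provided analysis components such as edge_ngram resolvable.
    @Override
    protected Collection<Class<? extends Plugin>> nodePlugins() {
        return List.of(Netty4Plugin.class, CommonAnalysisPlugin.class);
    }
}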

Hope that helps

Thanks for replying.

In the previous version of my app (I'm updating the ES version to v7.8), I had this in my pom.xml:

<!-- required by elasticsearch-test-jar -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.3.4</version>
    <type>test-jar</type>
    <scope>test</scope>
</dependency>

And I didn't have any problem like this.

However, Maven cannot load that test-jar for v7.8:

Could not transfer artifact org.elasticsearch:elasticsearch:jar:tests:7.8.0 from/to miami (http://mynexus/content/groups/public/): Transfer failed for http://mynexus/content/groups/public/org/elasticsearch/elasticsearch/7.8.0/elasticsearch-7.8.0-tests.jar

For that reason I've added this to my pom.xml:

<dependency>
    <groupId>org.elasticsearch.test</groupId>
    <artifactId>framework</artifactId>
    <version>7.8.0</version>
    <scope>test</scope>
</dependency>

And then I ran into these errors with the plugins. So, the question is:
is there any way to load everything at once, like I was doing in the past?

No, I'm afraid there is no way to do that these days with the way the build is modularised.
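
The closest workaround is to enumerate the module plugins you actually need in nodePlugins() and make sure each module's jar is on the test classpath (for example via re-publications like the analysis-common artifact you already added). A rough sketch; the extra plugin classes below are assumptions based on the 7.x module sources, so only add the ones whose jars you actually pull in:

@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
    return List.of(
        Netty4Plugin.class,                                            // HTTP/transport implementation for the test nodes
        org.elasticsearch.analysis.common.CommonAnalysisPlugin.class, // edge_ngram, word_delimiter, asciifolding, ...
        org.elasticsearch.index.reindex.ReindexPlugin.class,          // only if your tests use _reindex or update_by_query
        org.elasticsearch.join.ParentJoinPlugin.class                 // only if your mappings use join fields
    );
}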

Ok.. Thanks for all your help @Armin_Braun !
