Delete by query API invalid type name exception

I am using the High Level REST Client (6.6.0) and an embedded node (for system testing my code). I know the embedded node is not supported, but this is not a "live" system: it is a node that is spun up in my system tests and destroyed at the end.

I can start the node, create an index and index a number of documents.

I then try to run a "delete_by_query" command (using the low level client, as the high level client adds unsupported query parameters).

A curl example of my request is:

curl -X POST "http://localhost:9201/my-index/_delete_by_query" -H 'Content-Type: application/json' -d ' {"query":{"match_all":{}}} '
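For anyone who wants to reproduce this from Java without either Elasticsearch client, here is a rough JDK-only sketch of the same request. The class and method names (DeleteByQueryRequest, endpoint, send) are made up for illustration and are not part of any Elasticsearch API; the URL, index name and body are the ones from the curl command above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: sends the same request as the curl command using only the JDK.
class DeleteByQueryRequest {

    // Same JSON body as in the curl example.
    public static final String MATCH_ALL_BODY = "{\"query\":{\"match_all\":{}}}";

    // Builds <baseUrl>/<index>/_delete_by_query, tolerating a trailing slash on baseUrl.
    public static String endpoint(String baseUrl, String index) {
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        return base + "/" + index + "/_delete_by_query";
    }

    // POSTs the match_all body and returns "<status> <response body>".
    public static String send(String baseUrl, String index) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint(baseUrl, index)).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(MATCH_ALL_BODY.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        // 4xx/5xx bodies (such as the exception JSON) arrive on the error stream.
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        String body = in == null ? "" : readAll(in);
        conn.disconnect();
        return status + " " + body;
    }

    // Reads the whole stream as UTF-8 text and closes it.
    private static String readAll(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toString("UTF-8");
        }
    }
}
```

Calling DeleteByQueryRequest.send("http://localhost:9201", "my-index") should hit the same endpoint as the curl command, so it is a quick way to check the node's behaviour independently of the client libraries.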

I get an error response with the exception as "invalid_type_name_exception" and the reason being "Document mapping type cannot start with '_' found: [_delete_by_query]"

Interestingly, if I change the curl URL to "http://localhost:9201/my-index/doc/_delete_by_query", where "doc" is my type, the request runs but creates a document in the index.

This seems to be the same as this issue https://discuss.elastic.co/t/delete-by-query-error-invalid-type-name-exception/119629 but I didn't think I need to install a plugin for "delete_by_query"?

Can anyone help me with where I'm going wrong please?

After some further research, I can see that it's most likely that delete_by_query is unavailable in the embedded node. Can anyone confirm this?

It's a shame the embedded node is no longer supported; it makes system testing your code much more difficult.

It's because delete by query is part of a module.
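To make that concrete, here is a toy model of why the two requests behave the way they do. It is only a sketch under an assumption: when the module is missing, the _delete_by_query route is simply not registered, so the path falls through to the generic 6.x document routes /{index}/{type} and /{index}/{type}/{id}. The class and method names are invented for illustration and this is not how Elasticsearch implements its routing.

```java
// Toy model of REST path resolution with and without the delete-by-query route.
class RouteFallthrough {

    // Describes which handler a POST to `path` would reach.
    // `reindexModuleLoaded` stands in for "the _delete_by_query route is registered".
    public static String resolve(String path, boolean reindexModuleLoaded) {
        String[] parts = path.replaceAll("^/+", "").split("/");
        String last = parts[parts.length - 1];

        // With the module present, the dedicated route wins.
        if (reindexModuleLoaded && (parts.length == 2 || parts.length == 3)
                && last.equals("_delete_by_query")) {
            return "delete_by_query on index [" + parts[0] + "]";
        }
        // Otherwise the path is read as /{index}/{type}: index a document with a generated id.
        if (parts.length == 2) {
            if (parts[1].startsWith("_")) {
                // Type names may not start with '_', hence the exception in the question.
                return "invalid_type_name_exception: type [" + parts[1] + "]";
            }
            return "index document into [" + parts[0] + "/" + parts[1] + "] with generated id";
        }
        // ... or as /{index}/{type}/{id}: index a document with the given id.
        if (parts.length == 3) {
            return "index document into [" + parts[0] + "/" + parts[1] + "] with id [" + parts[2] + "]";
        }
        return "no matching route";
    }

    public static void main(String[] args) {
        // prints: invalid_type_name_exception: type [_delete_by_query]
        System.out.println(resolve("/my-index/_delete_by_query", false));
        // prints: index document into [my-index/doc] with id [_delete_by_query]
        System.out.println(resolve("/my-index/doc/_delete_by_query", false));
        // prints: delete_by_query on index [my-index]
        System.out.println(resolve("/my-index/_delete_by_query", true));
    }
}
```

The first two calls mirror the two symptoms in the question: the type-name exception, and the unexpected document created when "doc" is added to the URL.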

As you know, you cannot run Elasticsearch embedded, as this blog post says.

Note that to run integration tests (not unit tests) you would probably prefer running that in something close to a production environment, like a real elasticsearch server instance. I shared some ideas about integration testing in this thread: In memory testing with RestHighLevelClient

Also this sample project shows how to use Elasticsearch Test Classes:

Hi David

I will take a look further into the elasticsearch test cases as that looks interesting.

I like to write integration tests around an embedded client in my code to give me a level of assurance before testing against other environments. It gives quicker feedback IMO.

Thanks for your reply and the information

I updated the content of In memory testing with RestHighLevelClient...

I'm now using Docker from Maven myself, as shown here:

So I set up my code to use the Elasticsearch test classes and the nodes that they fire up. However, I still get the same errors when running delete by query.

Is this the same problem as when using an embedded node and the module is not loaded?

Yes. Probably the same problem.
As I said for integration tests I prefer to run a real node with all features.

Such a shame this isn't available...

What I would find really useful about having an embedded node is that I can write a system level test for my code (using JUnit, for example) which tests my code against a lifelike system without having to move out of my IDE. Then I can TDD my code to make that test pass, using all the features offered within the IDE (debugging etc). Once that passes, I can move on to further integration testing using other environments.

Is this pattern possible using the Docker from Maven solution?

That's what I'm doing and what I proposed in my previous answer.
But you can't debug elasticsearch itself. Which I don't think you need. I never needed to debug a server.

I can see that would work, but to me it brings in an external dependency on Docker into my development process, and I'm running on Windows which makes things more complicated.

Also, if you were pushing those tests through a CI pipeline, your build agents would then also need Docker.

Is there any way I can identify the module that provides delete_by_query and load it? Could I just find the plugin version of it and load that?

Thanks

I think that jars for modules are not published anywhere, so you would need to build them manually, add them to your repository manager and use them from there.

Speaking of CI, you can also install Docker on the machine where your CI is running, have an Elasticsearch service running somewhere in your network, or use an external service like cloud.elastic.co.

This is what I tried to showcase there:

(There are multiple branches - look at the last one. Note that the code is a bit old).

I had a breakthrough! Adding the reindex plugin to the node configuration makes the delete_by_query REST call work.

Oh I see. As it might be required by the High Level Rest Client for the Reindex API, it is also published.
TBH I don't know how long it will stay published. If at some point the HLClient only uses new classes for the reindex request and response objects, I think it will not be published anymore. But that's just my guess.

I'd not rely on this for the future.
