Is field type "wildcard" supported in elasticsearch-spark?


I am currently doing some tests to see if I can use Apache Spark and the elasticsearch-spark (elasticsearch-hadoop) to analyse data that I have in elasticsearch.

I am testing using the following versions in a docker-compose setup:

Spark: 3.1.2
Elasticsearch: 7.12.1

The Spark job is a Java program using the following dependencies:

  • spark-core_2.12:3.1.2
  • elasticsearch-spark-30_2.12:7.13.2

I use spark-submit to submit the application to the Spark cluster.

In Elasticsearch I have a single index with several string fields mapped. I have tested text, keyword, and wildcard; I am most interested in the wildcard type because of its efficient wildcard-search properties.

 "fielda"  : { "type" : "keyword"  },
 "fieldb"  : { "type" : "wildcard" },

Fields of type wildcard simply do not seem to exist when accessing the data through the JavaPairRDD in the Spark job.

When I run the Spark job and filter + count all documents where fieldb has a given value, I get 0 hits, although I know there should be 2.
Running the same code against fielda, which is of type keyword, finds the 2 hits.

I have also used a simple .first() to fetch the first available indexed document; when I print all entries in the associated key-value map, any field of type wildcard simply does not appear.
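To make this concrete, here is a minimal sketch of the kind of job I am running. The index name, field value, and Elasticsearch host are placeholders for my setup, not real names:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

import java.util.Map;

public class WildcardCheck {
    public static void main(String[] args) {
        // "elasticsearch" is the docker-compose service name (assumption);
        // "myindex", "fieldb", and "somevalue" are placeholders.
        SparkConf conf = new SparkConf()
                .setAppName("wildcard-check")
                .set("es.nodes", "elasticsearch")
                .set("es.port", "9200");

        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            // Each element is (document id, map of field name -> value).
            JavaPairRDD<String, Map<String, Object>> rdd =
                    JavaEsSpark.esRDD(jsc, "myindex");

            // Filter + count on the wildcard-typed field: returns 0 here,
            // although the same code on the keyword field fielda finds 2 hits.
            long hits = rdd.values()
                    .filter(doc -> "somevalue".equals(doc.get("fieldb")))
                    .count();
            System.out.println("hits for fieldb: " + hits);

            // Inspect the first document: wildcard fields are missing
            // from the key-value map entirely.
            Map<String, Object> first = rdd.values().first();
            first.forEach((k, v) -> System.out.println(k + " = " + v));
        }
    }
}
```

Running this requires a live cluster, so it is only meant to illustrate the two checks described above.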

I have compared these results with the response from searching Elasticsearch directly via the REST API; there I can see and search both keyword and wildcard fields without any problem.
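For reference, the direct REST check looks something like this (index name and value are again placeholders); it returns the expected 2 hits, and fieldb appears in each document's _source:

```shell
# Term query on the wildcard-typed field, straight against Elasticsearch.
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/myindex/_search' \
  -d '{ "query": { "term": { "fieldb": "somevalue" } } }'
```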

Some questions of mine:

  1. Is the wildcard field type not supported? And if not, does anybody know whether this feature is planned to be added soon to elasticsearch-hadoop/elasticsearch-spark?
  2. Is there any way to get wildcard fields to work with elasticsearch-spark?

An answer was given on GitHub: the wildcard field type is not yet supported.
