Query String Query for case insensitive matching using ES-Spark connector


(Pramod Sripada) #1

Hi, I am using the ES-Spark connector and trying to push filters down to ES, so that Spark receives only the exact data it needs for processing.

I am trying to perform a case-insensitive match on a column value, but I am unable to do it with a normal filter (pushed down as a wildcard query) because it matches the case strictly. I have read on Stack Overflow that case-insensitive matching requires a query string query. How can we specify a query string query in Spark DataFrame operations?

Current code for the filter:

df = spark.read.format("es").load("es_index").filter("business_name LIKE '%Walmart%'")

which pushes the filter down to ES as:

               {
               "wildcard":{
                  "business_name":"*Walmart*"
               }
               }

The wildcard query is case sensitive and will not match, for example, "walmart". Could you please let me know how this can be achieved with the ES-Spark connector?

Thanks


(James Baiera) #2

One way you could do this is to use a custom analyzer for the fields you are searching on. See: How to do case insensitive search on terms?
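A rough sketch of how that could look from PySpark, in case it helps. It assumes the index can be recreated with a lowercasing analyzer (the analyzer name `lowercase_keyword`, the index body, and the helper names `build_es_query` and `load_matching` are made up for illustration, and the typeless `mappings` layout assumes Elasticsearch 7 or later). The `es.query` setting is the connector option for supplying your own query, here a query string query, instead of relying on the pushed-down wildcard. The helper does not escape reserved query string characters in the search term, so treat it as a starting point only.

```python
import json

# Hypothetical index body: a custom analyzer that keeps each value as a single
# token and lowercases it at index time, so lowercase patterns match any casing.
LOWERCASE_INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {  # illustrative name, not from the thread
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "business_name": {"type": "text", "analyzer": "lowercase_keyword"}
        }
    },
}


def build_es_query(field, term):
    """Return a query string query (as JSON) matching `term` anywhere in `field`.

    The term is lowercased on the client so that it lines up with the
    lowercased tokens produced by the analyzer above.
    """
    body = {
        "query": {
            "query_string": {
                "default_field": field,
                "query": "*" + term.lower() + "*",
            }
        }
    }
    return json.dumps(body)


def load_matching(spark, index, field, term):
    """Read only the matching documents by handing the query to the connector."""
    return (
        spark.read.format("es")
        .option("es.query", build_es_query(field, term))
        .load(index)
    )
```

With an existing SparkSession this would be called as `df = load_matching(spark, "es_index", "business_name", "Walmart")`. The index body would have to be applied when the index is created, and existing data reindexed, before the lowercase pattern can match.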

