Handling array values when reading from Elasticsearch in Spark using elasticsearch-spark


I am trying to read from Elasticsearch in Spark using the es-hadoop library. I know that when reading with es-hadoop, we need to pass the option `es.read.field.as.array.include` to handle fields with array values, as in:

```scala
ApacheSpark.sqlContext.read
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "xx.xxx.xxx")
  .option("es.nodes.client.only", false)
  .option("pushdown", true)
  .option("es.read.field.as.array.include", "tags,fields.component,log.flags,ecs,message")
  .load("ds2-hue_error-2020.09.22")
```

But for that we need to know in advance which document fields in the Elasticsearch index contain arrays, and the dimension of those arrays, before calling the Spark read API. Is there any way to know which fields need to be passed as arrays? Otherwise the read throws an error.
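One complication is that Elasticsearch mappings do not distinguish a single value from an array of values, so the index mapping alone cannot answer this; a common workaround is to sample some documents (for example via the search API) and derive the list of array fields from the actual data. A minimal sketch of that idea, assuming the sampled documents have already been fetched as Python dicts (the function names here are illustrative, not part of any library):

```python
from typing import Any, Dict, List, Set

def find_array_fields(doc: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Collect dotted paths of fields whose value is a JSON array."""
    paths: Set[str] = set()
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, list):
            paths.add(path)
            # Recurse into arrays of objects so nested array fields are found too.
            for item in value:
                if isinstance(item, dict):
                    paths |= find_array_fields(item, path)
        elif isinstance(value, dict):
            paths |= find_array_fields(value, path)
    return paths

def array_include_option(docs: List[Dict[str, Any]]) -> str:
    """Union of array paths over sampled docs, formatted for
    es.read.field.as.array.include."""
    paths: Set[str] = set()
    for doc in docs:
        paths |= find_array_fields(doc)
    return ",".join(sorted(paths))

# Hypothetical sampled documents, loosely shaped like the index above.
sample = [
    {"tags": ["a", "b"], "fields": {"component": ["hue"]}, "message": "x"},
    {"log": {"flags": ["multiline"]}, "message": ["y", "z"]},
]
print(array_include_option(sample))
# → fields.component,log.flags,message,tags
```

Note this only sees fields that happen to be arrays in the sampled documents, so a small or unrepresentative sample can miss fields that are arrays elsewhere in the index.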


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.