WARN ScalaRowValueReader: Field 'hits.hits._score' is backed by an array but the associated Spark Schema does not reflect this

Madhur_Chopra · September 27, 2018, 7:55pm

I am trying to access the _id field from the document metadata:

from pyspark.sql.functions import udf

# UDF that pulls the document _id out of the metadata map
@udf('string')
def get_id(metadata):
    return metadata['_id']

# sqlContext is assumed to be the SQLContext provided by the PySpark shell
def get_index(server, index_name, query):
    df = sqlContext.read.format("org.elasticsearch.spark.sql")\
            .option("es.nodes", server)\
            .option("es.query", query)\
            .option("es.read.metadata", "true")\
            .option("es.read.metadata.field", "metadata")\
            .option("es.read.field.empty.as.null", "yes")\
            .option("es.index.read.missing.as.empty", "yes")\
            .option("es.read.field.validate.presence", "ignore")\
            .load(index_name)

    # Add the _id from the metadata map as a separate 'Id' column
    return df.withColumn('Id', get_id(df.metadata))

query = """{"query":{ "match_all": {}}}"""

The strange part is that when I call get_index() and run .select('Id') on the returned DataFrame, I see this warning, but when I select all the fields, I do not.

I have tried setting es.read.field.as.array.include to hits.hits._score, but the warning does not go away.
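
For reference, a minimal sketch of how that setting could be passed to the reader as one more option. The helper name build_reader_options and its parameters are hypothetical and not part of es-hadoop; only the option keys come from the code above and from the setting just mentioned. The sketch assumes several array fields are joined with commas. It only shows where the setting goes and is not confirmed to silence the warning.

```python
def build_reader_options(server, query, array_fields=()):
    """Build es-hadoop reader options as a plain dict of strings.

    Hypothetical helper for illustration: array_fields lists the field
    names that should be treated as arrays when the schema is inferred.
    """
    options = {
        "es.nodes": server,
        "es.query": query,
        "es.read.metadata": "true",
        "es.read.metadata.field": "metadata",
    }
    if array_fields:
        # Assumption: multiple field names are passed as a comma-separated list
        options["es.read.field.as.array.include"] = ",".join(array_fields)
    return options


# Possible usage inside get_index(), assuming the same sqlContext as above:
#   opts = build_reader_options(server, query, ["hits.hits._score"])
#   df = sqlContext.read.format("org.elasticsearch.spark.sql") \
#           .options(**opts) \
#           .load(index_name)
```

Keeping the options in a dict makes it easy to print exactly what was sent to the connector when comparing a run that shows the warning with one that does not.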

Thanks in advance for any help.

