Elasticsearch-hadoop inserting query as field

I am trying to use elasticsearch-hadoop v5.6.2 with pyspark and I have the following issue. The sample code is very simple, just to get the hang of it:

df = spark.read.format('org.elasticsearch.spark.sql').load('index/type')

What I get as a result contains the following:

|-- query: struct (nullable = true)
|    |-- match_all: struct (nullable = true)

which does not exist in my index mapping, leading to an exception being raised. Sometimes, after running a job that contained a filter, a subsequent job without any filtering picks up the filtering query in the schema, like this:

|-- query: struct (nullable = true)
|    |-- bool: struct (nullable = true)
|    |    |-- filter: struct (nullable = true)
|    |    |    |-- exists: struct (nullable = true)
|    |    |    |    |-- field: string (nullable = true)
|    |    |    |-- term: struct (nullable = true)
|    |    |    |    |-- http_user: string (nullable = true)
|    |    |-- must: struct (nullable = true)
|    |    |    |-- match_all: struct (nullable = true)
|    |-- match_all: struct (nullable = true)
|-- received_from: string (nullable = true)

Jobs were run against different indices, both in local mode (--master=local) and in cluster mode using Mesos as the cluster manager. What could be the issue?
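Not from the original post, but one way to keep the spurious field out of the DataFrame while debugging is the connector's read-time field exclusion setting. A minimal sketch; the host and index name are placeholders, and the `spark.read` call is shown commented since it needs a running cluster:

```python
# Assumed connector options; "es.read.field.exclude" is a real
# elasticsearch-hadoop setting, the host and index are placeholders.
es_options = {
    "es.nodes": "localhost:9200",      # placeholder Elasticsearch host
    "es.read.field.exclude": "query",  # drop the unexpected "query" field on read
}

# df = (spark.read.format("org.elasticsearch.spark.sql")
#       .options(**es_options)
#       .load("index/type"))

print(sorted(es_options))
```

This only hides the field on the Spark side; it does not explain how `query` ended up in the mapping in the first place.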


If you configure your index with dynamic mapping disabled, then any call that tries to add new fields to the mapping without going through the mapping endpoint directly should fail. It's possible that something is sending a query as part of an index request instead of a search request.
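To illustrate the suggestion above, here is a sketch of an index mapping with dynamic mapping set to strict, so that an indexing request carrying an unexpected `query` field is rejected instead of silently widening the mapping. The index and type names are placeholders taken from the example load path; only `received_from` is copied from the schema shown earlier:

```python
import json

# Hypothetical 5.x-style mapping body with dynamic mapping set to "strict":
# unknown fields in documents cause indexing to fail rather than being
# added to the mapping automatically.
mapping = {
    "mappings": {
        "type": {  # mapping type name, a placeholder
            "dynamic": "strict",
            "properties": {
                "received_from": {"type": "keyword"},
            },
        }
    }
}

body = json.dumps(mapping)
# PUT this body when creating the index, e.g.:
#   curl -XPUT 'localhost:9200/index' -d "$body"
print(body)
```

With this in place, whichever component is injecting the query into an index request should surface as a mapping exception rather than a new `query` field in the schema.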

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.