Spark SQL not reading all the columns from index

I am using PySpark 1.6.1 with elasticsearch-spark-13_2.10-7.5.1.jar to read data from ES 5.6.8 running on the AWS ES service.
I am able to use "es.read.field.include" to extract only the columns we need, or to register the index as a temp table and select only the columns we need.
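For reference, here is a minimal sketch of how that read looks on Spark 1.6; the endpoint, index name, type, and field names below are placeholders for illustration:

```python
# Minimal sketch for Spark 1.6 + ES-Hadoop; host, index, and field names are placeholders.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="es-read-example")
sqlContext = SQLContext(sc)

df = (sqlContext.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "my-es-endpoint.us-east-1.es.amazonaws.com")
      .option("es.port", "443")
      .option("es.nodes.wan.only", "true")   # commonly needed for the AWS ES service
      .option("es.read.field.include", "field_a,field_b")
      .load("my_index/my_type"))

# Register as a temp table and select only the columns we need.
df.registerTempTable("my_index_tmp")
sqlContext.sql("SELECT field_a, field_b FROM my_index_tmp").show()
```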
The source index mapping is not auto-updated, and any columns that are not in the index mapping are not available to extract from Spark.
How do we read all the columns from ES using Spark?
I tried passing es.query = {"query": {"match_all": {}}} as well, but it still uses the index mapping for the schema.
Is there a way I can extract the whole index and create a DataFrame on that data?
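For context, this is roughly how I am passing the query; the endpoint and index name are placeholders, and the setup is the same as in the sketch above:

```python
# Rough sketch of the read with es.query; host and index names are placeholders.
# As noted above, es.query only filters which documents come back -- the
# DataFrame schema is still derived from the index mapping.
# Assumes sc / sqlContext are set up as in the earlier sketch.
query = '{"query": {"match_all": {}}}'

df_all = (sqlContext.read
          .format("org.elasticsearch.spark.sql")
          .option("es.nodes", "my-es-endpoint.us-east-1.es.amazonaws.com")
          .option("es.port", "443")
          .option("es.nodes.wan.only", "true")
          .option("es.query", query)
          .load("my_index/my_type"))

df_all.printSchema()
```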

Can you share the mapping from your Elasticsearch index?
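If it helps, the mapping can be dumped with the _mapping API; something like this (the endpoint and index name are placeholders):

```python
# Hedged sketch: dump the current mapping for the index so it can be shared.
# The endpoint and index name are placeholders.
import json
import requests

resp = requests.get(
    "https://my-es-endpoint.us-east-1.es.amazonaws.com/my_index/_mapping")
print(json.dumps(resp.json(), indent=2))
```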

The source index mapping is not auto-updated, and any columns that are not in the index mapping are not available to extract from Spark.

I'm a little confused about this. Can you elaborate on how you have your index set up? ES-Hadoop depends on up-to-date mappings in Elasticsearch in order to determine what data type each field is.
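One quick way to see the mismatch is to compare the schema the connector derives with the fields actually present in the mapping; a rough sketch, reusing the placeholder names from the earlier examples:

```python
# Rough sketch: compare the connector-derived schema with the mapping fields.
# df is the DataFrame from the earlier read sketch; index/type names are placeholders.
import requests

df.printSchema()  # columns here mirror what the index mapping declares

mapping = requests.get(
    "https://my-es-endpoint.us-east-1.es.amazonaws.com/my_index/_mapping").json()
print(sorted(mapping["my_index"]["mappings"]["my_type"]["properties"].keys()))
```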

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
