Why doesn’t a dense_vector field show up in the Spark schema when using Elasticsearch-Hadoop?

Hi everyone,

I created an Elasticsearch index with a dense_vector field, along with some text fields. The mapping looks like this (simplified):

{"mappings": {"properties": {"embedding": {"type": "dense_vector","dims": 3,"index": true,"index_options": { "type": "int8_hnsw" }},"title": { "type": "text" },"text": { "type": "text" }}}}

When I read this index in Spark using the Elasticsearch for Apache Hadoop connector:

df = (spark.read.format("es")
      .option("es.nodes", "172.22.10.20")
      .option("es.port", "9200")
      .option("es.nodes.wan.only", "true")
      .load("bbb"))
df.printSchema()

the output only shows:

root
|-- text: string (nullable = true)
|-- title: string (nullable = true)

The embedding (dense_vector) field is missing from the schema entirely.

My Questions

  • Is dense_vector officially unsupported in the ES-Hadoop connector?

  • The documentation on supported field mappings doesn’t mention vector types. Does that mean they are silently ignored?

  • Is there any workaround to read these fields into Spark (e.g., as arrays of floats), or is duplicating the field into a regular float array the only option?

Thanks in advance for clarifying!

Unfortunately, dense_vector is one of the many field types the connector doesn't support: Support for all Elasticsearch field types · Issue #1813 · elastic/elasticsearch-hadoop · GitHub. Your best option might be to set es.output.json to true to dump out the raw JSON and parse the embedding yourself.
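
A minimal, untested sketch of that workaround in PySpark. Since the DataFrame reader drops the field during schema discovery, this goes through the lower-level org.elasticsearch.hadoop.mr.EsInputFormat with es.output.json enabled so each hit arrives as a raw JSON string; the Text key/value classes and the explicit Spark schema are my assumptions, so verify against your ES-Hadoop version:

from pyspark.sql import SparkSession
from pyspark.sql.types import ArrayType, FloatType, StringType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

# Read each hit as raw JSON via ES-Hadoop's MapReduce input format,
# bypassing the Spark SQL schema discovery that drops dense_vector.
conf = {
    "es.nodes": "172.22.10.20",
    "es.port": "9200",
    "es.nodes.wan.only": "true",
    "es.resource": "bbb",
    "es.output.json": "true",  # values arrive as JSON text, not writables
}
rdd = spark.sparkContext.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.Text",    # document _id (assumed)
    valueClass="org.apache.hadoop.io.Text",  # raw _source JSON (assumed)
    conf=conf,
)

# Parse the JSON with an explicit schema that declares the embedding
# as an array of floats (dims: 3 in the mapping above).
schema = StructType([
    StructField("title", StringType()),
    StructField("text", StringType()),
    StructField("embedding", ArrayType(FloatType())),
])
df = spark.read.json(rdd.values(), schema=schema)
df.printSchema()

The trade-off is that you give up schema inference entirely and have to keep the explicit schema in sync with the index mapping yourself.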