ES-hadoop serialize org.apache.hadoop.io.ShortWritable failed

Hi,

I am using PySpark with es-hadoop to process Elasticsearch data.

ES 7.4.0

Spark 2.3.1

PUT test
{
    "mappings": {
        "properties": {
            "price": {
                "type": "short"
            }
        }
    }
}

PUT test/_doc/1
{
    "price": 1
}

pyspark --driver-class-path ~/jars/elasticsearch-hadoop-7.4.0.jar --jars ~/jars/elasticsearch-hadoop-7.4.0.jar

conf = {
    "es.resource": "test",
    "es.nodes": "http://localhost:9200",
    "es.port": "9200",
    "es.nodes.wan.only": "true",
    "es.net.http.auth.user": "",
    "es.net.http.auth.pass": "",
}
rdd = sc.newAPIHadoopRDD(inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
                         keyClass="org.apache.hadoop.io.NullWritable",
                         valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
                         conf=conf)
"""
ERROR:
Task 0.0 in stage 1.0 (TID 1) had a not serializable result: org.apache.hadoop.io.ShortWritable
Serialization stack:
  - object not serializable (class: org.apache.hadoop.io.ShortWritable, value: 1)
  - writeObject data (class: java.util.HashMap)
  - object (class java.util.HashMap, {price=1})
  - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
  - object (class scala.Tuple2, (1,{price=1}))
  - element of array (index: 0)
  - array (class [Lscala.Tuple2;, size 1); not retrying
Traceback (most recent call last):
"""

When I change short to long in the mapping, I get the correct ES data back. Why is the short type not serializable?
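For what it's worth, my reading of the Spark and es-hadoop sources (so treat this as an assumption, not an official answer): EsInputFormat hands the short field back as org.apache.hadoop.io.ShortWritable, which implements Writable but not java.io.Serializable. PySpark's newAPIHadoopRDD converts the common Writables (IntWritable, LongWritable, Text, MapWritable, ...) to plain Java types before the result is sent back, but ShortWritable is not on that list, so the raw ShortWritable survives inside the map and Java serialization of the collected result fails. With long the value arrives as LongWritable, which is converted, so it works. One way to sidestep Writables entirely is the connector's Spark SQL reader; a minimal sketch, assuming the same local cluster and an existing spark session:

```python
# Sketch: read the index through es-hadoop's Spark SQL integration,
# which maps ES field types to Spark SQL types directly (short -> ShortType)
# and never passes values through Hadoop Writable classes.
df = (spark.read.format("org.elasticsearch.spark.sql")
      .option("es.nodes", "localhost")
      .option("es.port", "9200")
      .option("es.nodes.wan.only", "true")
      .load("test"))

df.printSchema()  # price should appear as a short column
df.show()
```

This needs the same elasticsearch-hadoop jar on the classpath as your pyspark invocation above; the `.option(...)` keys are the same es.* settings used in the conf dict.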

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.