Bug when reading if a field has no mapping (e.g. an empty array)

A tricky bug (or maybe I am missing a setting) with ES 2.4.4 and ES4Hadoop 2.4.4. Let's create an index by indexing a document:

PUT /toto/events/1
{
   "toto": 1,
   "data": [],
   "name": "webapp_loaded"
}

This will generate the mapping:

{
  "toto" : {
    "mappings" : {
      "events" : {
        "properties" : {
          "name" : {
            "type" : "string"
          },
          "toto" : {
            "type" : "long"
          }
        }
      }
    }
  }
}

Note: the field data has no mapping, I guess because it's an empty array. Now, let's query this with Spark:

scala> sc.esRDD("toto").first.toString
res22: String = (1,Map(toto -> 1))

Where are the name and data fields? It looks like the parser "ignores" every field that comes after data.

It seems to be fixed in ES4Hadoop 5.0.0.

Is 2.4.4 stable, or should I use 5.0.0 in production, even with an Elasticsearch cluster on version 2.4.4?

Thank you,

@ebuildy Linking your GitHub issue about this here as well: https://github.com/elastic/elasticsearch-hadoop/issues/946

To also repeat the outcome from the ticket: the issue here seems to be that you're using just the index name and no type name. I'll add a configuration validation stage to make sure that types are set, but this may have to wait for an appropriate release, since it is a contract change to the values accepted in a configuration.
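
In the meantime, passing the resource as index/type (here "toto/events", matching the PUT above) should avoid the problem. As a rough sketch of what such a check could look like on the user side: the object and method names below (ResourceCheck, requireIndexAndType) are made up for illustration and are not part of ES-Hadoop, and the rule "exactly one index and one type" is an assumption about what the validation would enforce.

```scala
// Hypothetical helper, NOT an ES-Hadoop API: verifies that a resource
// string has the "index/type" form before it is handed to the connector.
object ResourceCheck {
  def requireIndexAndType(resource: String): String = {
    // Split on "/" and drop empty segments (e.g. from a trailing slash).
    val parts = resource.trim.split("/").filter(_.nonEmpty)
    // Assumption: a valid resource names exactly one index and one type.
    require(
      parts.length == 2,
      s"Resource '$resource' should have the form 'index/type', e.g. 'toto/events'"
    )
    parts.mkString("/")
  }
}

// In the Spark shell, with `import org.elasticsearch.spark._` in scope:
//   sc.esRDD(ResourceCheck.requireIndexAndType("toto/events")).first
// Passing just "toto" throws an IllegalArgumentException instead of
// silently reading with an index-only resource.
```

Failing fast here is a design choice: an exception at configuration time is easier to notice than fields that are quietly missing from the result.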

Yeah, validation could save lives :wink:

Thank you,
