Indexing JSON with nested fields

I have JSON data with nested fields that I want to extract and turn into a Scala Map.

Here's the sample JSON:

"nested_field": [
  {
    "airport": "sfo",
    "score": 1.0
  },
  {
    "airport": "phx",
    "score": 1.0
  },
  {
    "airport": "sjc",
    "score": 1.0
  }
]

I want to use saveToEs() and construct a Scala Map so the field is indexed into an ES index with the mapping below:

 "nested_field": {
    "properties": {
      "score": {
        "type": "double"
      },
      "airport": {
        "type": "keyword",
        "ignore_above": 1024
      }
    }
  }

The JSON file is read into a DataFrame using spark.read.json("example.json"). What's the right way to construct the Scala Map in this case?

Thanks for any help!

This seems more like a generic Spark question. If you're asking how to parse JSON using Scala, that can be done hundreds of different ways, all of which are reasonable solutions. That's generally not a helpful answer on its own, though, so to give you a starting spot: ES-Hadoop uses the Jackson JSON libraries to parse JSON into objects and vice versa, and that library is a good place to start reading.
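
To make that concrete, here is a minimal sketch of one such approach using the elasticsearch-spark connector: convert the DataFrame rows into Scala Maps and call saveToEs() on the resulting RDD. The index/type name "flights/doc" is a placeholder, and the field accessors assume the schema Spark infers from your sample:

import org.apache.spark.sql.{Row, SparkSession}
import org.elasticsearch.spark._ // brings saveToEs() into scope on RDDs

val spark = SparkSession.builder().appName("nested-field-indexing").getOrCreate()

// multiLine is needed if the file is pretty-printed rather than
// one JSON object per line (Spark's default expectation)
val df = spark.read.option("multiLine", "true").json("example.json")

// Turn each Row into a Scala Map; the array-of-structs column becomes
// a Seq of Maps, which ES-Hadoop serializes as an array of JSON objects.
val docs = df.rdd.map { row =>
  val nested = row.getAs[Seq[Row]]("nested_field").map { r =>
    Map(
      "airport" -> r.getAs[String]("airport"),
      "score"   -> r.getAs[Double]("score")
    )
  }
  Map("nested_field" -> nested)
}

docs.saveToEs("flights/doc") // placeholder index/type; on ES 7+ use just the index name

Note that if you don't actually need the intermediate Maps, importing org.elasticsearch.spark.sql._ lets you call saveToEs() directly on the DataFrame and skip the conversion entirely.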

As for ensuring the mapping is what you want, I would suggest creating the index with your desired mapping before running your job. If you don't want to pre-create the index every time and would rather let ES-Hadoop create it, you can define an index template in Elasticsearch that applies the mapping you want to any index whose name matches its pattern.
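
For example (a sketch only; the exact template syntax varies by ES version — this is the typeless 7.x legacy-template form, on 6.x the properties must be wrapped in a mapping type name, and flights_template / flights-* are placeholder names):

PUT _template/flights_template
{
  "index_patterns": ["flights-*"],
  "mappings": {
    "properties": {
      "nested_field": {
        "properties": {
          "score": { "type": "double" },
          "airport": { "type": "keyword", "ignore_above": 1024 }
        }
      }
    }
  }
}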
