Unable to execute mtermvectors Elasticsearch query from AWS EMR cluster using Spark

Hello,
I am trying to execute this Elasticsearch query via Spark:

    POST /aa6/_mtermvectors  
    {
      "ids": [
        "ABC",
        "XYA",
        "RTE"
      ],
      "parameters": {
        "fields": [
          "attribute"
        ],
        "term_statistics": true,
        "offsets": false,
        "payloads": false,
        "positions": false
      }
    }

The code that I have written in Zeppelin is:

    def createString(): String = {
      return s"""_mtermvectors {
        "ids": [
          "ABC",
          "XYA",
          "RTE"
        ],
        "parameters": {
          "fields": [
            "attribute"
          ],
          "term_statistics": true,
          "offsets": false,
          "payloads": false,
          "positions": false
        }
      }"""
    }

    import org.elasticsearch.spark._
    sc.esRDD("aa6", "?q=" + createString).count

I get the error:

    org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: org.elasticsearch.hadoop.rest.EsHadoopRemoteException: parse_exception: parse_exception: Encountered " <RANGE_GOOP> "["RTE","XYA","ABC" "" at line 1, column 22.
    Was expecting:
        "TO" ...

    {"query":{"query_string":{"query":"_mtermvectors {\"ids\": [\"RTE\",\"ABC\",\"XYA\"], \"parameters\": {\"fields\": [\"attribute\"], \"term_statistics\": true, \"offsets\": false, \"payloads\": false, \"positions\": false } }"}}}
        at org.elasticsearch.hadoop.rest.RestClient.checkResponse(RestClient.java:477)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:408)

This is probably something simple, but I am unable to find a way to set the request body while making the Spark call.
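
From the error it looks like esRDD treats its second argument as a search query (a URI `?q=` query or query DSL), so the whole string ends up wrapped in a `query_string` query against `_search`, and the `_mtermvectors` endpoint never gets hit. The only workaround I can think of is to skip esRDD and POST the request body to `_mtermvectors` myself over HTTP. This is just a minimal sketch; the host, port and URL below are placeholders for whatever endpoint is reachable from the EMR cluster:

    import java.net.{HttpURLConnection, URL}
    import scala.io.Source

    // Placeholder endpoint -- replace with the ES host/port reachable from the EMR cluster.
    val esUrl = "http://my-es-host:9200/aa6/_mtermvectors"

    // POST a JSON body to the _mtermvectors endpoint and return the raw JSON response.
    def mtermvectors(body: String): String = {
      val conn = new URL(esUrl).openConnection().asInstanceOf[HttpURLConnection]
      conn.setRequestMethod("POST")
      conn.setRequestProperty("Content-Type", "application/json")
      conn.setDoOutput(true)
      conn.getOutputStream.write(body.getBytes("UTF-8"))
      val response = Source.fromInputStream(conn.getInputStream).mkString
      conn.disconnect()
      response
    }

    val requestBody =
      """{
        |  "ids": ["ABC", "XYA", "RTE"],
        |  "parameters": {
        |    "fields": ["attribute"],
        |    "term_statistics": true,
        |    "offsets": false,
        |    "payloads": false,
        |    "positions": false
        |  }
        |}""".stripMargin

    val termVectorsJson = mtermvectors(requestBody)   // raw JSON string, still needs parsing

For a large list of ids this could presumably be wrapped in mapPartitions so each executor posts its own batch, but is there a way to set the request body directly through esRDD / the es-hadoop connector?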
