Unable to execute mtermvectors Elasticsearch query from AWS EMR cluster using Spark

Hello,
I am trying to execute this Elasticsearch query via Spark:

    POST /aa6/_mtermvectors  
    {
      "ids": [
        "ABC",
        "XYA",
        "RTE"
      ],
      "parameters": {
        "fields": [
          "attribute"
        ],
        "term_statistics": true,
        "offsets": false,
        "payloads": false,
        "positions": false
      }
    }

The code that I have written in Zeppelin is:

    def createString(): String = {
      return s"""_mtermvectors {
        "ids": [
          "ABC",
          "XYA",
          "RTE"
        ],
        "parameters": {
          "fields": [
            "attribute"
          ],
          "term_statistics": true,
          "offsets": false,
          "payloads": false,
          "positions": false
        }
      }"""
    }

    import org.elasticsearch.spark._
    sc.esRDD("aa6", "?q=" + createString).count

I get the error:

    org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: org.elasticsearch.hadoop.rest.EsHadoopRemoteException: parse_exception: parse_exception: Encountered " <RANGE_GOOP> "["RTE","XYA","ABC" "" at line 1, column 22.
    Was expecting:
        "TO" ...

    {"query":{"query_string":{"query":"_mtermvectors {\"ids\": [\"RTE\",\"ABC\",\"XYA\"], \"parameters\": {\"fields\": [\"attribute\"], \"term_statistics\": true, \"offsets\": false, \"payloads\": false, \"positions\": false } }"}}}
        at org.elasticsearch.hadoop.rest.RestClient.checkResponse(RestClient.java:477)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:408)

This is probably something simple, but I am unable to find a way to set the request body while making the Spark call.
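
From the error it looks like esRDD treats its second argument as a search query (a URI `?q=` query or query DSL), so the whole string ends up wrapped in a `query_string` query against `_search`, and the `_mtermvectors` endpoint never gets hit. The only workaround I can think of is to skip esRDD and POST the request body to `_mtermvectors` myself over HTTP. This is just a minimal sketch; the host, port and URL below are placeholders for whatever endpoint is reachable from the EMR cluster:

    import java.net.{HttpURLConnection, URL}
    import scala.io.Source

    // Placeholder endpoint -- replace with the ES host/port reachable from the EMR cluster.
    val esUrl = "http://my-es-host:9200/aa6/_mtermvectors"

    // POST a JSON body to the _mtermvectors endpoint and return the raw JSON response.
    def mtermvectors(body: String): String = {
      val conn = new URL(esUrl).openConnection().asInstanceOf[HttpURLConnection]
      conn.setRequestMethod("POST")
      conn.setRequestProperty("Content-Type", "application/json")
      conn.setDoOutput(true)
      conn.getOutputStream.write(body.getBytes("UTF-8"))
      val response = Source.fromInputStream(conn.getInputStream).mkString
      conn.disconnect()
      response
    }

    val requestBody =
      """{
        |  "ids": ["ABC", "XYA", "RTE"],
        |  "parameters": {
        |    "fields": ["attribute"],
        |    "term_statistics": true,
        |    "offsets": false,
        |    "payloads": false,
        |    "positions": false
        |  }
        |}""".stripMargin

    val termVectorsJson = mtermvectors(requestBody)   // raw JSON string, still needs parsing

For a large list of ids this could presumably be wrapped in mapPartitions so each executor posts its own batch, but is there a way to set the request body directly through esRDD / the es-hadoop connector?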
