ES-Spark's esRDD method returns each document's id (_id in Elasticsearch) together with the raw document (_source, in Elasticsearch terms), but I also need metadata about the returned documents, such as the index name and type each document comes from.
I am querying multiple indices, i.e. my call to esRDD looks like this:
sparkContext.esRDD("index*/entities", query)
and the actual indices are named "index1", "index2", etc. I want to know which specific index each document in the resulting RDD came from.
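For illustration, here is a minimal sketch of such a read. The app name, the es.nodes address, and the match_all query are placeholders I am assuming for the example, not my real settings. It shows that each element of the RDD is only an (id, source) pair, with nothing that identifies the originating index:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds esRDD to SparkContext

object EsReadSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder configuration (assumed values, adjust to your cluster)
    val conf = new SparkConf()
      .setAppName("es-read-sketch")
      .set("es.nodes", "localhost:9200")
    val sparkContext = new SparkContext(conf)

    // Placeholder query (assumption): match every document
    val query = """{"query": {"match_all": {}}}"""

    // The wildcard resource spans index1, index2, ...
    // Each element is (document _id, _source as a Map)
    val rdd = sparkContext.esRDD("index*/entities", query)

    // Print a few results; the pair carries no index name or type
    rdd.take(5).foreach { case (id, source) =>
      println(s"id=$id source=$source")
    }

    sparkContext.stop()
  }
}
```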
Can this be done?
Thanks