Elasticsearch-hadoop for Spark: index documents from an RDD into a different index per day, myindex-2014-01-01 for example

My previous idea doesn't seem to work: elasticsearch-hadoop cannot send
documents directly to the _bulk endpoint; it only writes to a fixed
"index/type" resource.
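
For reference, here is roughly what I had in mind as a fallback (just a
sketch: eshost, myindex, mytype and the (day, json) pair layout are
placeholders, and error handling is omitted). Each executor would build one
_bulk payload per partition, with the target daily index written on each
action line:

    import java.net.{HttpURLConnection, URL}
    import org.apache.spark.rdd.RDD

    // Sketch of hand-rolled bulk indexing from the executors.
    // The RDD pairs a day string ("2014-01-01") with a JSON document.
    def bulkIndexByDay(docs: RDD[(String, String)]): Unit = {
      docs.foreachPartition { partition =>
        // One _bulk body per partition; the action line of each document
        // names its own daily index, which saveJsonToEs cannot express.
        val payload = partition.map { case (day, json) =>
          s"""{"index":{"_index":"myindex-$day","_type":"mytype"}}""" + "\n" + json
        }.mkString("", "\n", "\n")

        val conn = new URL("http://eshost:9200/_bulk")
          .openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod("POST")
        conn.setDoOutput(true)
        conn.getOutputStream.write(payload.getBytes("UTF-8"))
        conn.getOutputStream.close()
        conn.getResponseCode // should be checked, along with per-item errors in the response
        conn.disconnect()
      }
    }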

On Thursday, January 15, 2015 at 4:17:57 PM UTC+1, Julien Naour wrote:

Hi,

I'm working on a complex workflow using Spark (parsing, cleaning, machine
learning...).
At the end of the workflow I want to send the aggregated results to
Elasticsearch so that my portal can query the data.
There will be two types of processing: streaming, and the ability to
relaunch the workflow on all available data.

Right now I use elasticsearch-hadoop, and in particular its Spark support,
to send documents to Elasticsearch with the saveJsonToEs("myindex/mytype")
method.
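
Concretely, the current code looks roughly like this (a sketch: sc is an
existing SparkContext configured with es.nodes, and the path and index/type
names are placeholders):

    import org.elasticsearch.spark._ // adds saveJsonToEs to RDDs

    // One JSON document per line; every document lands in the same fixed index/type.
    val jsonDocs = sc.textFile("hdfs:///aggregated/results")
    jsonDocs.saveJsonToEs("myindex/mytype")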
The goal is to have one index per day, using an index template that we
built.
AFAIK, elasticsearch-hadoop cannot take a field of each document into
account to send it to the proper index.
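
Ideally I would like to write something like the following, where the index
name depends on a field of each document (hypothetical: as far as I know the
resource string cannot reference a document field in the version I use):

    // "date" would be a field of each document, e.g. "2014-01-01"
    jsonDocs.saveJsonToEs("myindex-{date}/mytype")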

What is the proper way to implement this?
Should I add a dedicated step, using Spark and the bulk API, so that each
executor sends documents to the proper index based on a field of each line?
Or is there something I missed in elasticsearch-hadoop?

Julien
