Continuously streaming documents out of ElasticSearch using Spark Streaming

Hi:

Is there a way to continuously stream data out of Elasticsearch using Spark Streaming? I've looked at the documentation (http://www.elastic.co/guide/en/elasticsearch/hadoop/master/spark.html), which has an example of a query and apparently uses a SparkContext (not a StreamingContext). The documentation states that it will stream all documents from ES that match the query, but I am looking for continuous streaming out of Elasticsearch.
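For reference, here is roughly what that one-off read looks like from PySpark, as far as I understand it. This is only a sketch: the index name `logs/event`, the query and the host are placeholders I made up, `es_read_conf` and `read_once` are hypothetical helper names, and it assumes the elasticsearch-hadoop jar is on the Spark classpath. It returns a finite RDD of whatever matches at the time of the call and does not keep delivering new documents.

```python
def es_read_conf(resource, query, nodes="localhost", port="9200"):
    """Build elasticsearch-hadoop settings for a single read.

    All argument values used below are placeholders, not real cluster details.
    """
    return {
        "es.nodes": nodes,        # Elasticsearch host(s) to connect to
        "es.port": port,          # REST port
        "es.resource": resource,  # "index/type" to read from
        "es.query": query,        # URI query or query DSL string
    }


def read_once(sc, conf):
    """Run the query once against an existing SparkContext `sc`.

    Returns an RDD of (document id, document fields) pairs. This is a
    batch read: documents indexed after the call are not picked up.
    """
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )


# Placeholder index and query, for illustration only.
conf = es_read_conf("logs/event", "?q=level:error")
```

With a live SparkContext, `read_once(sc, conf).take(5)` would show the first few matches, and that single snapshot is all it gives.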

Thanks


Hi, the only way to stream changes out of an ES cluster is the changes feed plugin, and it works. I have written a mini-documentation on how to achieve uni-directional streaming replication between two ES clusters separated by a high-latency network.

The only tools used: the changes feed plugin and fluentd with some custom plugins I wrote (you don't actually need anything else).

Good luck
