Read elasticsearch with Cascading


(Thomas Decaux) #1

Hello guys,

I would like to export Elasticsearch data to JSON via Cascading (or any other Hadoop tool, but I am trying Cascading right now).

    Properties props = new Properties();
    props.setProperty("es.nodes.wan.only", "true");
    props.setProperty("es.output.json", "true");

    Tap events = new EsTap("127.0.0.1", 9200, "test/test", "?q=*");

This gives:

Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
at org.elasticsearch.hadoop.cascading.EsLocalScheme.source(EsLocalScheme.java:157)
at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:166)
at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:139)
... 7 more

This seems expected, given the cast at that line: https://github.com/elastic/elasticsearch-hadoop/blob/2.4/cascading/src/main/java/org/elasticsearch/hadoop/cascading/EsLocalScheme.java#L157
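To illustrate the failure mode, here is a minimal standalone sketch. It is not the es-hadoop code: the class `JsonOutputCastSketch` and the helpers `unguarded` and `describe` are hypothetical names, and the assumption is simply that with `es.output.json` enabled each hit arrives as a JSON `String` while the consuming code still casts it to a `Map`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in es-hadoop.
class JsonOutputCastSketch {

    // Mimics code that assumes every hit is a parsed document.
    // The cast fails at runtime when the hit is a raw JSON string.
    @SuppressWarnings("unchecked")
    static Map<String, Object> unguarded(Object hit) {
        return (Map<String, Object>) hit;
    }

    // A guarded variant that checks the runtime type before using the hit.
    static String describe(Object hit) {
        if (hit instanceof Map) {
            return "map with " + ((Map<?, ?>) hit).size() + " field(s)";
        }
        if (hit instanceof String) {
            return "raw JSON: " + hit;
        }
        return "unexpected type: " + (hit == null ? "null" : hit.getClass().getName());
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = new HashMap<>();
        parsed.put("user", "thomas");
        String rawJson = "{\"user\":\"thomas\"}";

        // Both shapes are handled by the guarded variant.
        System.out.println(describe(parsed));
        System.out.println(describe(rawJson));

        // The unguarded cast reproduces the same kind of exception as the stack trace.
        try {
            unguarded(rawJson);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Whether a type check like this is the right fix inside the connector is a separate question; the sketch only shows why a `String` hit and a `Map` cast cannot coexist.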


(James Baiera) #2

This looks similar to the issue you posted about Pig. After taking a quick look at the code, I think this one is definitely a bug as well. I've opened an issue here.

With this showing up twice now, I'm also planning a comprehensive review of the es.output.json feature. It seems to be lacking in its tests.

