ElasticSearch+Hadoop+Spark


(János Háber) #1

Hi guys,

I am writing a Spark application where I want to use ES with Hadoop. I have a
lot of documents in ES that I now want to aggregate, but I can't.
My documents have different fields: some have a "twitter" field with values,
some have "facebook", etc.

When I try to read the data from ES, I get an exception:
java.lang.NullPointerException
    at org.elasticsearch.hadoop.serialization.dto.mapping.Field.add(Field.java:110)
    at org.elasticsearch.hadoop.serialization.dto.mapping.Field.add(Field.java:111)
    at org.elasticsearch.hadoop.serialization.dto.mapping.Field.add(Field.java:111)
    at org.elasticsearch.hadoop.serialization.dto.mapping.Field.add(Field.java:111)
    at org.elasticsearch.hadoop.serialization.dto.mapping.Field.toLookupMap(Field.java:98)
    at org.elasticsearch.hadoop.serialization.ScrollReader.<init>(ScrollReader.java:61)
    at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.init(EsInputFormat.java:223)
    at org.elasticsearch.hadoop.mr.EsInputFormat$WritableShardRecordReader.init(EsInputFormat.java:367)
    at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.<init>(EsInputFormat.java:183)
    at org.elasticsearch.hadoop.mr.EsInputFormat$WritableShardRecordReader.<init>(EsInputFormat.java:359)
    at org.elasticsearch.hadoop.mr.EsInputFormat.getRecordReader(EsInputFormat.java:498)
    at org.elasticsearch.hadoop.mr.EsInputFormat.getRecordReader(EsInputFormat.java:72)

My questions:

  • How can I read back the raw JSON from an ES query without ES-Hadoop trying
    to deserialize it? (I want to deserialize it manually.)
  • If I can't do that: ES returns "Object" for this field in the mapping, and
    the JSON contains an empty object "{}". How can I ignore this?

Thanks

b0c1
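
A rough sketch of the manual route asked about above, in case it helps later readers; it is not taken from the thread. It assumes each document arrives as a raw JSON string and shows one way to drop empty "{}" objects before aggregating. The `ES_CONF` dictionary is illustrative only: `es.output.json` is a setting documented for later ES-Hadoop releases, and whether it exists in the version discussed here is an assumption to verify. The index name and the helper functions are hypothetical.

```python
import json
from collections import Counter

# Hypothetical connector settings for asking ES-Hadoop to return raw JSON.
# "es.output.json" is documented for later ES-Hadoop releases; its presence
# in the version used in this thread is an assumption, not a confirmed fact.
ES_CONF = {
    "es.resource": "myindex/mytype",  # hypothetical index/type
    "es.query": "?q=*",
    "es.output.json": "true",
}


def is_empty_object(value):
    """True only for the empty JSON object {} (not for 0, "" or [])."""
    return isinstance(value, dict) and not value


def prune_empty_objects(value):
    """Recursively drop keys whose value is, or becomes, an empty object."""
    if isinstance(value, dict):
        pruned = {k: prune_empty_objects(v) for k, v in value.items()}
        return {k: v for k, v in pruned.items() if not is_empty_object(v)}
    if isinstance(value, list):
        return [prune_empty_objects(v) for v in value]
    return value


def count_fields(raw_docs):
    """Count how many documents carry each top-level field after pruning."""
    counts = Counter()
    for raw in raw_docs:
        doc = prune_empty_objects(json.loads(raw))
        counts.update(doc.keys())
    return counts


if __name__ == "__main__":
    sample = [
        '{"name": "a", "twitter": {"followers": 10}}',
        '{"name": "b", "facebook": {"likes": 3}, "twitter": {}}',
        '{"name": "c", "facebook": {"page": {}}}',
    ]
    print(count_fields(sample))
```

In a Spark job the same `prune_empty_objects` step could be applied per record once the connector hands back JSON strings; the counting here only stands in for whatever aggregation is actually needed.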

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6b31f532-1798-44f8-913a-0b56dbe2d2dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Costin Leau) #2

Hi,

Issue #231, which I believe you raised, has been fixed in 2.x. Can you please try the latest 2.0.1.BUILD-SNAPSHOT
and report back?

Thanks!

On 7/15/14 9:32 AM, János Háber wrote:

[...]

--
Costin


