[Hadoop] Writing to Elasticsearch with Hive

Hi all,

Hadoop 2.5.1
Elasticsearch-Hadoop: 2.0.2
OS: CentOS 7

My Hive script below reads data from one Elasticsearch instance in order
to process it, then writes it back to another Elasticsearch instance. The
processing phase is not yet implemented.

ADD JAR hdfs://adress_ip:port/path/elasticsearch-hadoop-2.0.2.BUILD-SNAPSHOT.jar;

-- Reading
CREATE EXTERNAL TABLE in (clientip STRING, bytes BIGINT)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'access-2015.01.30/apache-access', 'es.nodes' = 'ip_address:port');

-- Writing
CREATE EXTERNAL TABLE out (client STRING, BIGINT)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'access/clientrequest', 'es.nodes' = 'ip_address:port');

INSERT OVERWRITE TABLE out
SELECT *
FROM cin;

It works when my table out has only one column. However, I get this
error as soon as I try to add more columns:

... error parsing conf job.xml
... lineNumber: 640; columnNumber: 51; referenced character "&#

Has anyone already run into the same problem?

Thanks,
Valentin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/73041225-0c9d-4705-8f4b-6ff99f887459%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

That's due to a bug in Hive (I assume you are running version 0.13).
This has been fixed in the dev version of es-hadoop, which is available from Maven [1].

[1] http://www.elastic.co/guide/en/elasticsearch/hadoop/master/install.html#download-dev
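
As background on the error message itself, here is a minimal Python sketch of how an XML parser reacts to a numeric character reference it does not accept. It rests on an assumption the thread does not confirm: that a multi-column property value was joined with a control character and written to job.xml as a reference such as `&#0;`. The log line above is truncated, so the actual character is unknown; the property name `example.columns` and the helpers `make_conf` and `parse_conf` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET


def make_conf(value):
    """Build a tiny job.xml-like document with a single property (hypothetical layout)."""
    return (
        "<configuration><property>"
        "<name>example.columns</name>"
        "<value>%s</value>"
        "</property></configuration>" % value
    )


def parse_conf(xml_text):
    """Return (True, properties) on success or (False, parser message) on failure."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    return True, props


# One column: no separator is needed, so the value is plain text.
one_column = make_conf("client")

# Two columns: ASSUMPTION, the separator is a NUL written as the reference &#0;.
# XML 1.0 does not allow a character reference to U+0000.
two_columns = make_conf("client&#0;bytes")

print(parse_conf(one_column))
print(parse_conf(two_columns))
```

Under that assumption, the single-column document parses while the two-column one is rejected by the parser, which would match the symptom of the job failing only once a second column is added.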

On 4/9/15 6:33 PM, valentin.dupont222@gmail.com wrote:

--
Costin

I was running version 0.14. I upgraded Hive to version 1.1.0 and it worked.
Thanks a lot!

Valentin

On Thursday, April 9, 2015 at 19:31:47 UTC+2, Costin Leau wrote:

The latest dev build of es-hadoop (the dev build, not the stable release; see the link I posted) works with Hive 0.14 as well.

Happy to hear you have solved it by upgrading Hive.

Cheers,

On 4/10/15 11:14 AM, valentin.dupont222@gmail.com wrote:

--
Costin
