Better places to store es.nodes and es.port in ES Hive integration?

Hi,
I am experimenting with the Elasticsearch and Hive integration. The documentation
says to set configuration such as es.nodes and es.port in TBLPROPERTIES. It works,
but it leads to a lot of redundant code. If I have ten data sets to index into
the same ES cluster, I have to repeat this information ten times in TBLPROPERTIES.
Even if I use variable substitution, I still have to rewrite the substitution
variable in each table definition.
What I am looking for is a way to put this information in one file and pass its
location, in some way, to the Hive CLI, so that the Hive Elasticsearch integration
picks up these settings when it looks for the ES server to talk to.
I do not want to put this information into files like hive-site.xml.
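
To illustrate the repetition, here is a rough sketch of two table definitions that index into the same cluster. The table names, columns, index names and host name are made up for illustration; only the storage handler class and the es.* property names come from the es-hadoop Hive integration.

```sql
-- Hypothetical table and index names; the es.* values are placeholders.
CREATE EXTERNAL TABLE es_orders (id STRING, total STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'orders/order',
              'es.nodes' = 'my.es.host',
              'es.port' = '9200');

-- The same two connection settings have to be repeated here.
CREATE EXTERNAL TABLE es_customers (id STRING, name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'customers/customer',
              'es.nodes' = 'my.es.host',
              'es.port' = '9200');
```

With ten tables, the es.nodes and es.port entries would appear ten times.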

Thanks,

Jack

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7040c805-e845-4b3d-a9fe-5e18d8445f7f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Could you please raise an issue with some type of example? Due to the way Hadoop (and Hive) works,
configuring a job tends to be tricky.

The configuration needs to be created before a job is submitted, which in practice means "dynamic configurations"
are basically impossible (this also has some security implications, which are simply avoided this way).
Thus one either specifies the configuration manually or loads a file from a known location (hive-site.xml, core-site.xml...)
upfront, before the job is submitted.
This means that when dealing with Hive, Pig, Cascading, etc., unless one adds a pre-processor to the job content (script,
flow, etc.),
by the time es-hadoop kicks in the job is already running, and thus any changes it makes are discarded.

Cheers,


--
Costin


Thanks Costin,
I am aiming at modifying the existing Hadoop cluster and Hive installation
and also at modularizing some common es.* properties in a separate, common
place. I know the first goal can be achieved with the Hive CLI --auxpath
option and the Hive table's TBLPROPERTIES. For the second goal, I am able to
move some es.* settings from the TBLPROPERTIES declaration to Hive's set
statements. For example, I can put

set es.nodes=my.domain.com

in the same HQL file and then skip the es.nodes setting in TBLPROPERTIES in the
external table declarations in the SAME HQL file. But I wish I could move the set
statement into a separate file. I now realize this is really a Hive
question.
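
As a rough sketch of the single-file arrangement described above (the table, column and index names are hypothetical, and the host name is the placeholder from the message):

```sql
-- Session-level setting, declared once at the top of the script.
set es.nodes=my.domain.com;

-- Hypothetical tables; note that es.nodes is not repeated in TBLPROPERTIES.
CREATE EXTERNAL TABLE es_orders (id STRING, total STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'orders/order');

CREATE EXTERNAL TABLE es_customers (id STRING, name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'customers/customer');
```

The remaining limitation is that the set statement still lives in the same file as the table declarations.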
Regards,
Jack


Just sharing a solution I found on the Hive side.

The Hive CLI has an -i option that takes a file of Hive commands to initialize
the session.
So I can put a list of set commands, as well as add jar ... commands, in one
file, say init.hive,
and then run the CLI like this: hive -i init.hive -f myscript.hql. Note that the table
creation HQL inside myscript.hql does not have to set the es.* properties as long
as they appear in the init.hive file. This solves my problem.
Thanks,

Jinyuan (Jack) Zhou


Thanks for sharing - can you also give an example of the table initialization in init.hive vs myscript.hql?

Cheers!


--
Costin


Sure, I was able to run the following command against my remote ES cluster:
hive -i init.hive -f search.hql

Below are the contents of init.hive, search.hql, and the data file in HDFS,
/user/cloudera/hivework/foobar/foobar.data.

I replaced the value of es.nodes with a fake name. Other than that, it should
run without problems. I am using the feature called dynamic/multi-resource
writes. It works in this example, but when I also add the 'es.mapping.id' =
'id' setting, I get the following error:

Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Unexpected character ('"' (code 34)): was expecting comma to separate OBJECT entries
at [Source: [B@7be1d686; line: 1, column: 53]
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:300)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:278)

-----init.hive----

set es.nodes=my.remote.escluster;
set es.port=9200;
set es.index.auto.create=yes;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
set mapred.map.tasks.speculative.execution=false;
set mapred.reduce.tasks.speculative.execution=false;
set hive.mapred.reduce.tasks.speculative.execution=false;
add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/elasticsearch-hadoop-hive-2.0.0.jar;

-----search.hql----

use search;
DROP TABLE IF EXISTS foo;
CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/hivework/foobar';
select * from foo;
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');

INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;

----- /user/cloudera/hivework/foobar/foobar.data ---

1, bar1, first_bar
2, bar2, first_bar
3, foo_bar_1, second_bar
4, foo_bar_12, second_bar

Jinyuan (Jack) Zhou


Most likely some of your data contains invalid entries, which results in an invalid JSON payload being sent to ES.
Check your ID values and/or keep an eye on issue #217, which aims to provide more human-friendly messages for the user.
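
One way to follow up on the suggestion to check the ID values is a quick query against the source table. This is only a sketch: it assumes the foo table from search.hql and says nothing about the actual cause of the error. Because the sample rows use ", " as a separator while the table is delimited by ',', it may also be worth looking for stray spaces in bar_type, which feeds the {bar_type} part of es.resource.

```sql
-- Sanity check on the source table defined in search.hql.
-- Lists rows whose id is missing or empty, or whose id or bar_type
-- carries leading/trailing whitespace.
SELECT id, bar, bar_type
FROM foo
WHERE id IS NULL
   OR trim(id) = ''
   OR id <> trim(id)
   OR bar_type <> trim(bar_type);
```

Any rows this returns are candidates for cleaning before running the INSERT OVERWRITE into es_foo.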

Cheers.

On 6/17/14 2:42 AM, Jinyuan Zhou wrote:

sure, I was able to run follwoing command against my remote es cluster.
hive -i init.hive -f search.hql.

Below is the contents of init.hive, search.hql and data file in hdfs /user/cloudera/hivework/foobar/foobar.data

I replaced value for es.nodes with fake name. Other than that, it should ran without problem. I am using feature called
'dynamic/mult resource wirtes. It works in this example, but when I also add 'es.mapping.id http://es.mapping.id' =
'id' setting. I got a the following error:
/
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Unexpected character ('"' (code 34)): was expecting
comma to separate OBJECT entries
at [Source: [B@7be1d686; line: 1, column: 53]
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:300)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:278)/

-----init.hive----

set es.nodes=my.remote.escluster;
set es.port=9200;
set es.index.auto.create=yes;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
set mapred.map.tasks.speculative.execution=false;
set mapred.reduce.tasks.speculative.execution=false;
set hive.mapred.reduce.tasks.speculative.execution=false;
add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/elasticsearch-hadoop-hive-2.0.0.jar;

-----search.hql----

use search;
DROP TABLE IF EXISTS foo;
CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/hivework/foobar';
select * from foo;
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');

INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;

----- /user/cloudera/hivework/foobar/foobar.data ---

1, bar1, first_bar
2, bar2, first_bar
3, foo_bar_1, second_bar
4, foo_bar_12, second_bar
~

Jinyuan (Jack) Zhou

On Mon, Jun 16, 2014 at 2:06 PM, Costin Leau <costin.leau@gmail.com mailto:costin.leau@gmail.com> wrote:

Thanks for sharing - can you also give an example of the table initialization in init.hive vs myscript.hql?

Cheers!


On 6/16/14 11:19 PM, Jinyuan Zhou wrote:

    Just share a solution  I learned  hive side.

    hive cli has an -i option that takes a  file of hive commands to initilize the session.
    so I can put a list of set comand as well as add jar ... command in one file, say inithive
    then run the cli as this:  hive -i init.hive -f myscript.hql.  Note table creation hql inside myscript.hql don't
    have to
    set es.* properties as long as it appears in init.hive file  This solves my problem.
    Thanks,


    Jinyuan (Jack) Zhou


    On Sun, Jun 15, 2014 at 10:24 AM, Jinyuan Zhou <zhou.jinyuan@gmail.com <mailto:zhou.jinyuan@gmail.com>
    <mailto:zhou.jinyuan@gmail.com <mailto:zhou.jinyuan@gmail.com>__>> wrote:

         Thanks Costin,
         I am aiming at modifying  the existing hadoop cluster and hive installation and also modularizing   some
    common es.*
         properies in a separate common place.  I know the first goal can be achieved with hive cli  --auxpath
    option  and
         hive table's TBLPROPERTERTIES. For the secon goal, I am able to move  some es.* settings from TBLPROPERTIES
         declaration to hive's set statments. For example, I can put

             set es.nodes=my.domain.com <http://my.domain.com> <http://my.domain.com>


         in the same hql file  then skip es.nodes setting in TBLPROPERTIES in the external table delcarations in the
    SAME
         hql. But I wish  I can move the set statetemnt in a separate file. I now realize this is rather a  hive
    question.
         Regards,
         Jack




I will check the values. However, the problem only occurs when I use
es.mapping.id together with the 'dynamic/multi resource writes' feature.
Used separately, they are fine.

Jinyuan (Jack) Zhou

On Tue, Jun 17, 2014 at 6:25 AM, Costin Leau costin.leau@gmail.com wrote:

Most likely some of your data contains invalid entries, which
result in an invalid JSON payload being sent to ES.
Check your ID values and/or keep an eye on issue #217, which aims to
provide more human-friendly messages for the user.

Cheers.

Add friendlier diagnostics to EsHadoopInvalidRequest · Issue #217 · elastic/elasticsearch-hadoop · GitHub

On 6/17/14 2:42 AM, Jinyuan Zhou wrote:

Sure, I was able to run the following command against my remote ES cluster:
hive -i init.hive -f search.hql

Below are the contents of init.hive, search.hql and the data file in HDFS,
/user/cloudera/hivework/foobar/foobar.data.

I replaced the value of es.nodes with a fake name. Other than that, it should
run without problems. I am using the feature called
'dynamic/multi resource writes'. It works in this example, but when I also
add the 'es.mapping.id' = 'id' setting, I get the following error:

Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Unexpected character ('"' (code 34)): was expecting comma to separate OBJECT entries
 at [Source: [B@7be1d686; line: 1, column: 53]
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:300)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:278)

-----init.hive----

set es.nodes=my.remote.escluster;
set es.port=9200;
set es.index.auto.create=yes;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
set mapred.map.tasks.speculative.execution=false;
set mapred.reduce.tasks.speculative.execution=false;
set hive.mapred.reduce.tasks.speculative.execution=false;
add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/elasticsearch-hadoop-hive-2.0.0.jar;

-----search.hql----

use search;
DROP TABLE IF EXISTS foo;
CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/hivework/foobar';
select * from foo;
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');

INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;

----- /user/cloudera/hivework/foobar/foobar.data ---

1, bar1, first_bar
2, bar2, first_bar
3, foo_bar_1, second_bar
4, foo_bar_12, second_bar

Jinyuan (Jack) Zhou

On Mon, Jun 16, 2014 at 2:06 PM, Costin Leau <costin.leau@gmail.com> wrote:

Thanks for sharing - can you also give an example of the table
initialization in init.hive vs myscript.hql?

Cheers!


On 6/16/14 11:19 PM, Jinyuan Zhou wrote:

    Just sharing a solution I learned on the Hive side.

    The Hive CLI has an -i option that takes a file of Hive commands to
    initialize the session, so I can put a list of set commands as well as
    add jar ... commands in one file, say init.hive, and then run the CLI
    like this: hive -i init.hive -f myscript.hql.
    Note that the table creation hql inside myscript.hql doesn't have to
    set es.* properties as long as they appear in the init.hive file. This
    solves my problem.
    Thanks,

    Jinyuan (Jack) Zhou



I have confirmed with both elasticsearch hive and elasticsearch mr: if both
of the situations below occur, EsOutputFormat produces an invalid header for
bulk indexing.

  1. es.resource contains data to be extracted from the document
  2. es.mapping.id is set to one of the fields in the document

I looked at the code and the invalid header JSON. It is missing a "," between
"_index": "???", "_type": "???" and the rest of the internal fields. I believe
the following code inside AbstractBulkFactory.java is responsible. I am using
elasticsearch-hadoop 2.0.

protected void writeBeforeObject(List pieces) {
    startHeader(pieces);
    index(pieces); id(pieces); parent(pieces); routing(pieces);
    ttl(pieces); version(pieces); timestamp(pieces);
    otherHeader(pieces); endHeader(pieces);
    scriptParams(pieces);
}
Thanks,
Jack

Jinyuan (Jack) Zhou
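
For readers following along, here is a minimal sketch of the symptom described above. It is not the es-hadoop code: the class BulkHeaderSketch and the methods buggyHeader and fixedHeader are hypothetical names made up for illustration, and the values are assumed to need no JSON escaping. It only shows how a header assembled from a list of pieces can end up without a "," between the "_type" entry and the "_id" entry, and how joining the entries with an explicit separator avoids that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the es-hadoop implementation.
class BulkHeaderSketch {

    // Buggy variant: the index/type piece is appended without any
    // bookkeeping, so the id piece follows it with no separating comma.
    static String buggyHeader(String index, String type, String id) {
        List<String> pieces = new ArrayList<>();
        pieces.add("{\"index\":{");
        pieces.add("\"_index\":\"" + index + "\",\"_type\":\"" + type + "\"");
        pieces.add("\"_id\":\"" + id + "\""); // no "," before this entry
        pieces.add("}}");
        return String.join("", pieces);
    }

    // Fixed variant: collect the header entries first, then join them
    // with "," so every pair of neighbours is separated.
    static String fixedHeader(String index, String type, String id) {
        List<String> entries = new ArrayList<>();
        entries.add("\"_index\":\"" + index + "\"");
        entries.add("\"_type\":\"" + type + "\"");
        if (id != null) {
            entries.add("\"_id\":\"" + id + "\"");
        }
        return "{\"index\":{" + String.join(",", entries) + "}}";
    }

    public static void main(String[] args) {
        // Values taken from the first row of the sample data in this thread.
        System.out.println(buggyHeader("foo_index", "first_bar", "1"));
        System.out.println(fixedHeader("foo_index", "first_bar", "1"));
    }
}
```

The first line printed is not valid JSON because two quoted entries are adjacent, which is the kind of input that makes a JSON parser report that it "was expecting comma to separate OBJECT entries". Whether the real cause in AbstractBulkFactory is the same bookkeeping issue is an assumption here, not something confirmed in this thread.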

On Tue, Jun 17, 2014 at 6:25 AM, Costin Leau costin.leau@gmail.com wrote:

Most likely the some of your data contains some invalid entries which
result in an invalid JSON payload being sent to ES.
Check your ID values and/or keep an eye on issue #217 which aims to
provide more human-friendly messages for the user.

Cheers.

Add friendlier diagnostics to EsHadoopInvalidRequest · Issue #217 · elastic/elasticsearch-hadoop · GitHub

On 6/17/14 2:42 AM, Jinyuan Zhou wrote:

sure, I was able to run follwoing command against my remote es cluster.
hive -i init.hive -f search.hql.

Below is the contents of init.hive, search.hql and data file in hdfs
/user/cloudera/hivework/foobar/foobar.data

I replaced value for es.nodes with fake name. Other than that, it should
ran without problem. I am using feature called
'dynamic/mult resource wirtes. It works in this example, but when I also
add 'es.mapping.id http://es.mapping.id' =
'id' setting. I got a the following error:
/
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest:
Unexpected character ('"' (code 34)): was expecting
comma to separate OBJECT entries
at [Source: [B@7be1d686; line: 1, column: 53]
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.
java:300)
at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.
java:278)/

-----init.hive----

set es.nodes=my.remote.escluster;
set es.port=9200;
set es.index.auto.create=yes;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
set mapred.map.tasks.speculative.execution=false;
set mapred.reduce.tasks.speculative.execution=false;
set hive.mapred.reduce.tasks.speculative.execution=false;
add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/
elasticsearch-hadoop-hive-2.0.0.jar;

-----search.hql----

use search;
DROP TABLE IF EXISTS foo;
CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/hivework/foobar';
select * from foo;
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');

INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;

----- /user/cloudera/hivework/foobar/foobar.data ---

1, bar1, first_bar
2, bar2, first_bar
3, foo_bar_1, second_bar
4, foo_bar_12, second_bar
~

Jinyuan (Jack) Zhou

On Mon, Jun 16, 2014 at 2:06 PM, Costin Leau <costin.leau@gmail.com
mailto:costin.leau@gmail.com> wrote:

Thanks for sharing - can you also give an example of the table

initialization in init.hive vs myscript.hql?

Cheers!


On 6/16/14 11:19 PM, Jinyuan Zhou wrote:

    Just share a solution  I learned  hive side.

    hive cli has an -i option that takes a  file of hive commands to

initilize the session.
so I can put a list of set comand as well as add jar ... command
in one file, say inithive
then run the cli as this: hive -i init.hive -f myscript.hql.
Note table creation hql inside myscript.hql don't
have to
set es.* properties as long as it appears in init.hive file This
solves my problem.
Thanks,

    Jinyuan (Jack) Zhou


    On Sun, Jun 15, 2014 at 10:24 AM, Jinyuan Zhou <

zhou.jinyuan@gmail.com mailto:zhou.jinyuan@gmail.com
<mailto:zhou.jinyuan@gmail.com mailto:zhou.jinyuan@gmail.com__>>
wrote:

         Thanks Costin,
         I am aiming at modifying  the existing hadoop cluster and

hive installation and also modularizing some
common es.*
properies in a separate common place. I know the first goal
can be achieved with hive cli --auxpath
option and
hive table's TBLPROPERTERTIES. For the secon goal, I am able
to move some es.* settings from TBLPROPERTIES
declaration to hive's set statments. For example, I can put

             set es.nodes=my.domain.com <http://my.domain.com> <

http://my.domain.com>

         in the same hql file  then skip es.nodes setting in

TBLPROPERTIES in the external table delcarations in the
SAME
hql. But I wish I can move the set statetemnt in a separate
file. I now realize this is rather a hive
question.
Regards,
Jack

         On Sun, Jun 15, 2014 at 2:19 AM, Costin Leau <

costin.leau@gmail.com mailto:costin.leau@gmail.com
<mailto:costin.leau@gmail.com mailto:costin.leau@gmail.com>__>
wrote:

             Could you please raise an issue with some type of

example? Due to the way Hadoop (and Hive) works,
things tend to be tricky in terms of configuring a job.

             The configuration needs to be created before a job is

submitted which in practice means "dynamic
configurations"
are basically impossible (this also has some security
implications which are simply avoided this way).
Thus either one specifies the configuration manually or
loads a known location file (hive-site.xml,
core-site.xml...)
upfront, before the job is submitted.
This means when dealing with Hive, Pig, Cascading,
etc... unless one adds a pre-processor to the job
content
(script, flow, etc...)
by the time es-hadoop kicks in, the job is already
running and thus its changes discarded.

             Cheers,

             On 6/14/14 1:57 AM, Jinyuan Zhou wrote:

                 Hi,
                 I am playing with elasticsearch and hive

integration. The documentation says
to set configuration like es.nodes, es.port in
TBLPROPERTIES. It works.
But it can cause many reduntant codes. If I have ten
data set to index to the same es cluster,
I would have to repeat this information ten times
in TBLPROPERTIES. Even if
I use var substitution I still have to rwrite
this subtititiov var for each table definition.
What I am looking for is to put these info in say
one file and pass the location, in some way, to
hive cli
so hive elasticsearch will get these settings when
trying to find es server to talk to.
I am not looking into put these info into files
like hive-site.xml.

                 Thanks,

                 Jack

                 --
                 You received this message because you are subscribed

to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving
emails from it, send an email to
elasticsearch+unsubscribe@go__oglegroups.com <
http://googlegroups.com>
<mailto:elasticsearch%2Bunsubscribe@googlegroups.com <mailto:
elasticsearch%252Bunsubscribe@googlegroups.com>
>
<mailto:elasticsearch+
__
unsubscribe@googlegroups.com
mailto:elasticsearch%2B__unsubscribe@googlegroups.com <mailto:
elasticsearch%2Bunsubscribe@googlegroups.com
mailto:elasticsearch%252Bunsubscribe@googlegroups.com
>>.

                 To view this discussion on the web visit
    https://groups.google.com/d/____msgid/elasticsearch/

7040c805-____e845-4b3d-a9fe-5e18d8445f7f%____40googlegroups.com
<https://groups.google.com/d/__msgid/elasticsearch/7040c805-
__e845-4b3d-a9fe-5e18d8445f7f%__40googlegroups.com>
<https://groups.google.com/d/__msgid/elasticsearch/7040c805-
__e845-4b3d-a9fe-5e18d8445f7f%__40googlegroups.com
<https://groups.google.com/d/msgid/elasticsearch/7040c805-
e845-4b3d-a9fe-5e18d8445f7f%40googlegroups.com>>

    <https://groups.google.com/d/____msgid/elasticsearch/

7040c805-____e845-4b3d-a9fe-5e18d8445f7f%_40GGGROUPS CASINO – Real Slot Casino for 10,000+ Senior Players
medium=__email&utm_source=__footer
<https://groups.google.com/d/__msgid/elasticsearch/7040c805-
__e845-4b3d-a9fe-5e18d8445f7f%_40GGGROUPS CASINO – Real Slot Casino for 10,000+ Senior Players
medium=__email&utm_source=footer>

    <https://groups.google.com/d/__msgid/elasticsearch/7040c805-

__e845-4b3d-a9fe-5e18d8445f7f%_40GGGROUPS CASINO – Real Slot Casino for 10,000+ Senior Players
medium=__email&utm_source=footer
<https://groups.google.com/d/msgid/elasticsearch/7040c805-
e845-4b3d-a9fe-5e18d8445f7f%40GGGROUPS CASINO – Real Slot Casino for 10,000+ Senior Players
email&utm_source=footer>>>.
For more options, visit
https://groups.google.com/d/____optout
https://groups.google.com/d/__optout <
https://groups.google.com/d/__optout <https://groups.google.com/d/optout

.

             --
             Costin

             --
             You received this message because you are subscribed to

a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/____topic/elasticsearch/____
1WH7kOD3uKs/unsubscribe
<https://groups.google.com/d/__topic/elasticsearch/__
1WH7kOD3uKs/unsubscribe>
<https://groups.google.com/d/__topic/elasticsearch/__
1WH7kOD3uKs/unsubscribe
<https://groups.google.com/d/topic/elasticsearch/
1WH7kOD3uKs/unsubscribe>>.
To unsubscribe from this group and all its topics, send
an email to
elasticsearch+unsubscribe@__go__oglegroups.com <
http://googlegroups.com>
<mailto:elasticsearch%2Bunsubscribe@googlegroups.com
mailto:elasticsearch%252Bunsubscribe@googlegroups.com
>.

             To view this discussion on the web visit
    https://groups.google.com/d/____msgid/elasticsearch/

539D6507.____3080207%40gmail.com
<https://groups.google.com/d/__msgid/elasticsearch/539D6507.
__3080207%40gmail.com>
<https://groups.google.com/d/_
_msgid/elasticsearch/539D6507.__3080207%40gmail.com
<https://groups.google.com/d/msgid/elasticsearch/539D6507.
3080207%40gmail.com>>.
For more options, visit https://groups.google.com/d/__
__optout https://groups.google.com/d/__optout
<https://groups.google.com/d/__optout <
https://groups.google.com/d/optout>>.

         --
         -- Jinyuan (Jack) Zhou


--
Costin

--
Costin

Please upgrade to version 2.0.1

On 9/17/14 1:18 AM, Jinyuan Zhou wrote:

I have confirmed with both the elasticsearch-hadoop Hive and MR integrations that, when both of the conditions below hold,
EsOutputFormat produces an invalid header for bulk indexing:

  1. es.resource contains a field to be extracted from the document
  2. es.mapping.id is set to one of the fields in the document

I looked at the code and at the invalid header JSON. It is missing a "," between "_index": "???", "_type": "???" and the
rest of the internal fields. I believe the following code inside AbstractBulkFactory.java is responsible. I am using
elasticsearch-hadoop 2.0.

protected void writeBeforeObject(List pieces) {
    startHeader(pieces);

    index(pieces);

    id(pieces);
    parent(pieces);
    routing(pieces);
    ttl(pieces);
    version(pieces);
    timestamp(pieces);

    otherHeader(pieces);
    endHeader(pieces);

    scriptParams(pieces);
}
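
For reference, here is a minimal sketch of a well-formed bulk action header when the index, type and id are all present. It is an illustration only, not the es-hadoop implementation: the class BulkHeaderSketch and its methods are hypothetical names. The point is simply that every field in the header object has to be separated from the previous one by a comma, which is what the error above says is missing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is NOT the es-hadoop AbstractBulkFactory code.
class BulkHeaderSketch {

    // Escapes only backslashes and double quotes, which is enough for this illustration.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the action line of a bulk "index" request.
    // The optional id is skipped when null; String.join places exactly one
    // comma between the fields that are present.
    static String header(String index, String type, String id) {
        List<String> fields = new ArrayList<>();
        fields.add("\"_index\":" + quote(index));
        fields.add("\"_type\":" + quote(type));
        if (id != null) {
            fields.add("\"_id\":" + quote(id));
        }
        return "{\"index\":{" + String.join(",", fields) + "}}";
    }

    public static void main(String[] args) {
        System.out.println(header("foo_index", "first_bar", "1"));
        System.out.println(header("foo_index", "second_bar", null));
    }
}
```

Collecting the fields in a list and joining them once avoids having each field writer decide on its own whether a separator is needed.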

Thanks,
Jack

Jinyuan (Jack) Zhou

On Tue, Jun 17, 2014 at 6:25 AM, Costin Leau <costin.leau@gmail.com> wrote:

Most likely some of your data contains invalid entries, which result in an invalid JSON payload being sent
to ES.
Check your ID values and/or keep an eye on issue #217, which aims to provide more human-friendly messages for the user.

Cheers.

https://github.com/elasticsearch/elasticsearch-hadoop/issues/217

On 6/17/14 2:42 AM, Jinyuan Zhou wrote:

    Sure, I was able to run the following command against my remote ES cluster:
    hive -i init.hive -f search.hql

    Below are the contents of init.hive, search.hql, and the data file in HDFS, /user/cloudera/hivework/foobar/foobar.data.

    I replaced the value of es.nodes with a fake name. Other than that, it should run without problems. I am using the
    feature called 'dynamic/multi-resource writes'. It works in this example, but when I also add the
    'es.mapping.id' = 'id' setting, I get the following error:

    Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Unexpected character ('"' (code 34)): was
    expecting comma to separate OBJECT entries
       at [Source: [B@7be1d686; line: 1, column: 53]
              at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:300)
              at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:278)



    -----init.hive----

    set es.nodes=my.remote.escluster;
    set es.port=9200;
    set es.index.auto.create=yes;
    set hive.cli.print.current.db=true;
    set hive.exec.mode.local.auto=true;
    set mapred.map.tasks.speculative.execution=false;
    set mapred.reduce.tasks.speculative.execution=false;
    set hive.mapred.reduce.tasks.speculative.execution=false;
    add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/elasticsearch-hadoop-hive-2.0.0.jar;

    -----search.hql----

    use search;
    DROP TABLE IF EXISTS foo;
    CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/user/cloudera/hivework/foobar';
    select * from foo;
    DROP TABLE IF EXISTS es_foo;
    CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
    TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');

    INSERT OVERWRITE TABLE es_foo  SELECT * FROM foo;

    ----- /user/cloudera/hivework/foobar/foobar.data ---

    1, bar1, first_bar
    2, bar2, first_bar
    3, foo_bar_1, second_bar
    4, foo_bar_12, second_bar
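
As an illustration of the dynamic resource used in search.hql, the sketch below resolves a pattern such as foo_index/{bar_type} against one row. It is a hypothetical helper (ResourcePatternSketch is not an es-hadoop class) and assumes only that each {field} placeholder is replaced by the value of that field in the current row.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of per-row resource resolution; not es-hadoop code.
class ResourcePatternSketch {

    // Matches one {fieldName} placeholder and captures the field name.
    private static final Pattern FIELD = Pattern.compile("\\{([^}]+)\\}");

    // Replaces every {field} in the pattern with the row's value for that field.
    static String resolve(String pattern, Map<String, String> row) {
        Matcher m = FIELD.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing field: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "1");
        row.put("bar", "bar1");
        row.put("bar_type", "first_bar");
        System.out.println(resolve("foo_index/{bar_type}", row)); // foo_index/first_bar
    }
}
```

With the sample data above, rows 1 and 2 would go to one type and rows 3 and 4 to another, which is why the target of each document is only known once the row has been read.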




    Jinyuan (Jack) Zhou


    On Mon, Jun 16, 2014 at 2:06 PM, Costin Leau <costin.leau@gmail.com> wrote:

         Thanks for sharing - can you also give an example of the table initialization in init.hive vs myscript.hql?

         Cheers!


         On 6/16/14 11:19 PM, Jinyuan Zhou wrote:

             Just sharing a solution I learned on the Hive side.

             The Hive CLI has an -i option that takes a file of Hive commands to initialize the session,
             so I can put a list of set commands as well as add jar ... commands in one file, say init.hive,
             and then run the CLI like this: hive -i init.hive -f myscript.hql. Note that the table creation HQL inside
             myscript.hql doesn't have to set es.* properties as long as they appear in the init.hive file.
             This solves my problem.
             Thanks,


             Jinyuan (Jack) Zhou


             On Sun, Jun 15, 2014 at 10:24 AM, Jinyuan Zhou <zhou.jinyuan@gmail.com> wrote:

                  Thanks Costin,
                  I am aiming at modifying the existing Hadoop cluster and Hive installation, and also at modularizing
                  some common es.* properties in a separate common place. I know the first goal can be achieved with the
                  Hive CLI --auxpath option and the Hive table's TBLPROPERTIES. For the second goal, I am able to move
                  some es.* settings from the TBLPROPERTIES declaration to Hive's set statements. For example, I can put

                      set es.nodes=my.domain.com

                  in the same HQL file and then skip the es.nodes setting in TBLPROPERTIES in the external table
                  declarations in the SAME HQL file. But I wish I could move the set statement into a separate file.
                  I now realize this is rather a Hive question.
                  Regards,
                  Jack


                  On Sun, Jun 15, 2014 at 2:19 AM, Costin Leau <costin.leau@gmail.com> wrote:

                      Could you please raise an issue with some type of example? Due to the way Hadoop (and Hive) works,
                      things tend to be tricky in terms of configuring a job.

                      The configuration needs to be created before a job is submitted, which in practice means "dynamic
                      configurations" are basically impossible (this also has some security implications which are simply
                      avoided this way).
                      Thus either one specifies the configuration manually or loads a known location file (hive-site.xml,
                      core-site.xml...) upfront, before the job is submitted.
                      This means when dealing with Hive, Pig, Cascading, etc... unless one adds a pre-processor to the job
                      content (script, flow, etc...), by the time es-hadoop kicks in, the job is already running and thus
                      its changes discarded.

                      Cheers,

                      On 6/14/14 1:57 AM, Jinyuan Zhou wrote:

                          Hi,
                          I am playing with the Elasticsearch and Hive integration. The documentation says
                          to set configuration like es.nodes and es.port in TBLPROPERTIES. It works,
                          but it can cause a lot of redundant code. If I have ten data sets to index into the same ES
                          cluster, I have to repeat this information ten times in TBLPROPERTIES. Even if
                          I use variable substitution, I still have to rewrite the substitution variable for each table
                          definition.
                          What I am looking for is a way to put this information in, say, one file and pass its location,
                          in some way, to the Hive CLI, so that the Hive Elasticsearch integration gets these settings when
                          trying to find the ES server to talk to.
                          I am not looking to put this information into files like hive-site.xml.

                          Thanks,

                          Jack

                      --
                      Costin

                  --
                  -- Jinyuan (Jack) Zhou


         --
         Costin

--
Costin

--
Costin
