[hadoop] Multiple indexes setting for 'es.resource'

Hi,

I have a problem configuring 'es.resource' to include multiple
indexes.

The Hive table I created looks like this:

CREATE EXTERNAL TABLE test
(
date timestamp,
clientip string,
request string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES
(
'es.resource' = 'apache-2014.09.29/apache-access',
-- or
-- 'es.resource' = 'apache-2014.09.30/apache-access',
'es.mapping.names' = 'date:@timestamp'
);

I then ran 'select count(*) from test;', a Hive query that counts the
total number of rows in the table.
The result matches the ES count: 1454536 for the apache-2014.09.29 index
and 215564 for the apache-2014.09.30 index.
Then, to include multiple indexes, I changed
'es.resource' = 'apache-2014.09.29/apache-access' to
'es.resource' = 'apache-2014.09.*/apache-access' or
'es.resource' = 'apache-2014.09.29,apache-2014.09.30/apache-access'.
I ran 'select count(*) from test;' again to count the total number of
documents in the indexes,
but this time the result differs from the ES count.
Hive returns 2919161, but it should be 1670100 (1454536 + 215564).
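
To make the size of the discrepancy explicit, here is a minimal Python sketch of the arithmetic. It only uses the counts quoted in this message; the variable names are made up for illustration and nothing here talks to Hive or ES.

```python
# Per-index document counts reported above (Hive and ES agreed on these).
per_index_counts = {
    "apache-2014.09.29": 1454536,
    "apache-2014.09.30": 215564,
}

# Count returned by Hive once 'es.resource' covered both indexes.
observed_multi_index_count = 2919161

# What the multi-index count should be if every document is read once.
expected_total = sum(per_index_counts.values())

# Number of extra rows Hive reported beyond the expected total.
surplus = observed_multi_index_count - expected_total

print("expected:", expected_total)
print("observed:", observed_multi_index_count)
print("surplus rows:", surplus)
```

The surplus is not equal to either per-index count, so the extra rows do not look like one whole index simply being read twice.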

Any help would be appreciated.

Environment information

  • CentOS 6.4 64-bit / Java version "1.7.0_55"
  • CDH-5.1.2-1.cdh5.1.2.p0.3
  • Hive 0.12.0
  • elasticsearch-hadoop-2.0.1
  • 3-node Hadoop and ES cluster
