Hi all,
I have a Pig relation holding Tomcat logs with 3 fields: log_date, log_url
and log_nb.
I want to store this in ES with one document per log_url, each document
having an array of nested maps, one map per day:
{ "log_url": "http://www.xxx.fr/index.html",
"log_hits": [
{
"log_nb": 1,
"log_date": "20150406"
} ,
{
"log_nb": 2,
"log_date": "20150407"
}
]
}
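To make the target structure concrete, here is a minimal Python sketch of the grouping I am after. It is only an illustration, not part of the Pig job; the function name build_documents and the sample rows are made up for the example.

```python
from collections import defaultdict


def build_documents(rows):
    """Group (log_date, log_url, log_nb) rows into one document per URL."""
    hits_by_url = defaultdict(list)
    for log_date, log_url, log_nb in rows:
        # One nested map per row, appended in input order.
        hits_by_url[log_url].append({"log_nb": log_nb, "log_date": log_date})
    return [
        {"log_url": url, "log_hits": hits}
        for url, hits in hits_by_url.items()
    ]


# Hypothetical sample rows matching the document shown above.
rows = [
    ("20150406", "http://www.xxx.fr/index.html", 1),
    ("20150407", "http://www.xxx.fr/index.html", 2),
]
documents = build_documents(rows)
```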
This script will be run every day, generating a new entry for each URL, so
for a given log_url the array will grow by one element each day.
As stated in the es-hadoop documentation
http://www.elastic.co/guide/en/elasticsearch/hadoop/current/pig.html#tuple-names,
if we set es.mapping.pig.tuple.use.field.names (false by default) to true,
tuples will be treated as an array of maps when stored in ES.
The Pig code looks like this:
b = LOAD ......
c = group b BY log_url;
d = FOREACH c GENERATE
group AS log_url,
TOTUPLE (log_date, log_nb) AS log_hits;
store d into 'myindex/myindex'
using org.elasticsearch.hadoop.pig.EsStorage (
'es.mapping.pig.tuple.use.field.names=true',
'es.write.operation=upsert',
'es.mapping.id=log_url'
);
The first time I run it, it creates the following document:
{ "log_url": "http://www.xxx.fr/index.html",
"log_hits": [
{
"log_nb": 1,
"log_date": "20150406"
}
]
}
So far, so good.
When it is run again with a new date (say "20150407"), instead of appending
a new entry to the embedded array "log_hits", it replaces the single
existing element, and the ES document becomes:
{ "log_url": "http://www.xxx.fr/index.html",
"log_hits": [
{
"log_nb": 2,
"log_date": "20150407"
}
]
}
I was expecting to get:
{ "log_url": "http://www.xxx.fr/index.html",
"log_hits": [
{
"log_nb": 1,
"log_date": "20150406"
} ,
{
"log_nb": 2,
"log_date": "20150407"
}
]
}
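To pin down the two behaviors, here is a small Python model. It is not es-hadoop or Elasticsearch code, and the function names merge_replace and merge_append are made up for illustration. merge_replace mimics what I observe, where the incoming log_hits overwrites the stored one; merge_append is what I am after.

```python
import copy


def merge_replace(existing, partial):
    """Model of the observed upsert: every incoming field overwrites
    the stored one, arrays included."""
    merged = copy.deepcopy(existing)
    merged.update(copy.deepcopy(partial))
    return merged


def merge_append(existing, partial, array_field="log_hits"):
    """Model of the desired upsert: the array field is extended,
    all other fields are overwritten."""
    merged = copy.deepcopy(existing)
    for key, value in partial.items():
        if key == array_field:
            merged.setdefault(key, []).extend(copy.deepcopy(value))
        else:
            merged[key] = copy.deepcopy(value)
    return merged


url = "http://www.xxx.fr/index.html"
day1 = {"log_url": url, "log_hits": [{"log_nb": 1, "log_date": "20150406"}]}
day2 = {"log_url": url, "log_hits": [{"log_nb": 2, "log_date": "20150407"}]}

observed = merge_replace(day1, day2)  # log_hits holds only the 20150407 entry
expected = merge_append(day1, day2)   # log_hits holds both entries
```

So the question is really whether EsStorage can be told to behave like merge_append rather than merge_replace for the log_hits field.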
Is there a way to achieve that?
Thanks,
Philippe