I'm interested in using the EsOutputFormat class in a Hadoop MapReduce
task.
During experimentation I have noticed that there is no direct handling for
'date' objects.
My data contains a number of 'date' fields which must be carried over into
the Elasticsearch index. However, I am currently unable to do so: fields
which should be of type 'date' are instead simply submitted into the index
as 'string' type.
Using templates, I have tried defining dynamic_date_formats as well as
explicitly specifying a date type and format mapping in a dynamic template
which matches against the names of those fields which should be 'date'
types.
In either case, fields indexed into my Elasticsearch cluster which
should be recognized as 'date' types are only set as strings.
Here is an example template similar to the one I have been
experimenting with.
Make sure the template actually matches. This is not always obvious, but it is easy to test. First, define the
template, then send a request with a sample payload and check whether the doc gets created with the expected mapping. A
common mistake is defining the template after the index has been created, which makes it useless: the template is
applied when the index is created (and thus becomes part of its mapping).
Second, if the mapping appears correct, double-check your es-hadoop configuration and consider turning on logging to see
the payload sent by es-hadoop to Elasticsearch.
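For what it's worth, the check described above could be sketched roughly as follows. This is only an illustration under several assumptions: a cluster reachable at localhost:9200, the 1.x-style endpoints (`/_template/...`, `/{index}/{type}/{id}`, `/{index}/_mapping`), and hypothetical names (`date_test`, `index-name-test`, `doc`, `date_created`). The trailing wildcards and the `_default_` type name are guesses at what the original template contained, and `HH` (24-hour clock) is used instead of the question's `hh` (12-hour clock in Joda-style patterns) so that the sample value parses.

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumed cluster address

# Legacy (1.x-style) index template patterned on the one in the question.
# The wildcards and the "_default_" type name are assumptions.
TEMPLATE = {
    "template": "index-name-*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {"date_field_template": {
                    "match": "date_*",
                    "mapping": {"type": "date", "format": "yyyy-MM-dd HH:mm"},
                }}
            ]
        }
    },
}


def build_requests(index="index-name-test", doc_type="doc"):
    """Build the three requests in the order they have to be sent."""
    def req(method, path, body=None):
        data = json.dumps(body).encode("utf-8") if body is not None else None
        return urllib.request.Request(
            ES + path, data=data, method=method,
            headers={"Content-Type": "application/json"})

    return [
        # 1. Register the template BEFORE the index exists.
        req("PUT", "/_template/date_test", TEMPLATE),
        # 2. Index a sample doc; this creates the index and applies the template.
        req("PUT", "/%s/%s/1" % (index, doc_type),
            {"date_created": "2014-07-01 23:09"}),
        # 3. Read the mapping back and look at the type of date_created.
        req("GET", "/%s/_mapping" % index),
    ]


def run():
    """Send the requests to the cluster and print each response body."""
    for r in build_requests():
        with urllib.request.urlopen(r) as resp:
            print(r.get_method(), r.full_url, resp.status)
            print(resp.read().decode("utf-8"))
```

Calling `run()` against a test cluster and inspecting the last response should show whether `date_created` came out as a date or as a string. If the index already existed before the template was registered, delete it first, since the template only takes effect at index creation.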
Hope this helps,
On 7/1/14 11:09 PM, Telax wrote:
Hello,
I'm interested in using the EsOutputFormat class in a Hadoop MapReduce task.
During experimentation I have noticed that there is no direct handling for 'date' objects.
My data contains a number of 'date' fields which must be carried over into the Elasticsearch index. However, I am
currently unable to do so: fields which should be of type 'date' are instead simply submitted into the index as
'string' type.
Using templates, I have tried defining dynamic_date_formats as well as explicitly specifying a date type and format
mapping in a dynamic template which matches against the names of those fields which should be 'date' types.
In either case, fields indexed into my Elasticsearch cluster which should be recognized as 'date' types are only
set as strings.
Here is an example template similar to the one I have been experimenting with.
{
  "template" : "index-name-",
  "mappings" : {
    "default" : {
      "dynamic_date_formats" : ["yyyy-MM-dd hh:mm"],
      "dynamic_templates" : [
        { "date_field_template": {
            "match": "date_",
            "mapping": {
              "type": "date",
              "format" : "yyyy-MM-dd hh:mm"
            }
          }
        }
      ]
    }
  }
}
Any help on this issue would be greatly appreciated.
Thanks
My issue actually came down to the ordering of my dynamic templates. I had
a 'match: *' catch-all as the first dynamic template, which disabled norms.
Although this template didn't explicitly define a type for any matched
field, it would automatically set the 'date' field to a string type. The
"date_" template would then match but fail to set the type to date, as it
had already been defined. Simply reordering my dynamic templates so that
the date matcher came before the catch-all solved the issue.
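To illustrate why the order matters, here is a small Python model of "first matching dynamic template wins". It is a deliberately simplified sketch, not Elasticsearch's actual implementation: it only looks at the `match` pattern, and the template names, the `norms` setting syntax, the `date_*` wildcard and the `date_created` field are assumptions made up for the example.

```python
from fnmatch import fnmatch


def resolve_type(field, detected_type, dynamic_templates):
    """Return (template name, resulting type) for a new field.

    Simplified model: templates are tried in order and the first one whose
    `match` pattern fits the field name decides the mapping.
    """
    for entry in dynamic_templates:
        (name, spec), = entry.items()  # each entry holds exactly one template
        if fnmatch(field, spec.get("match", "*")):
            # A template without an explicit type keeps the detected type.
            return name, spec["mapping"].get("type", detected_type)
    return None, detected_type


# Hypothetical catch-all that only disables norms and sets no type.
catch_all = {"no_norms": {"match": "*",
                          "mapping": {"norms": {"enabled": False}}}}
# Date template, assuming the original pattern carried a trailing wildcard.
dates = {"date_field_template": {"match": "date_*",
                                 "mapping": {"type": "date",
                                             "format": "yyyy-MM-dd HH:mm"}}}

# Catch-all first: the date template is never reached.
print(resolve_type("date_created", "string", [catch_all, dates]))
# → ('no_norms', 'string')

# Date template first: the field is mapped as a date.
print(resolve_type("date_created", "string", [dates, catch_all]))
# → ('date_field_template', 'date')
```

In other words, the more specific matcher has to come before the catch-all in the `dynamic_templates` array.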
On 10 Jul 2014 22:52, "Costin Leau" costin.leau@gmail.com wrote:
Make sure the template actually matches. This is not always obvious, but
it is easy to test. First, define the template, then send a request with a
sample payload and check whether the doc gets created with the expected
mapping. A common mistake is defining the template after the index has been
created, which makes it useless: the template is applied when the index is
created (and thus becomes part of its mapping).
Second, if the mapping appears correct, double-check your es-hadoop
configuration and consider turning on logging to see the payload sent by
es-hadoop to Elasticsearch.
Hope this helps,