I'm loading data from a Hive table (0.13) into Elasticsearch (1.4.4).
With the auto-create index option turned on, I don't face any problems and
I can see all the data in ES.
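For context, my setup is along these lines (table, index, and field names below are placeholders, not my actual schema):

    -- Sketch of the es-hadoop Hive integration in use here.
    CREATE EXTERNAL TABLE es_export (
      id       BIGINT,
      name     STRING,
      location STRING
    )
    STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
    TBLPROPERTIES (
      'es.resource'          = 'my_index/my_type',
      'es.nodes'             = 'localhost:9200',
      'es.index.auto.create' = 'true'
    );

    -- Writing to the external table pushes the rows into ES.
    INSERT OVERWRITE TABLE es_export
    SELECT id, name, location FROM source_table;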
However, I get the following error when I create the index manually:
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [Bad Request(400) - [MapperParsingException[failed to parse]; nested: NumberFormatException[For input string: "NULL"]; ]]; Bailing out..
    at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:199)
    at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:165)
    at org.elasticsearch.hadoop.rest.RestRepository.sendBatch(RestRepository.java:170)
    at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:152)
    at org.elasticsearch.hadoop.rest.RestRepository.writeProcessedToIndex(RestRepository.java:146)
    at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:63)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:621)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:262)
To create the index manually, I used the same mapping from the first
auto-create step and changed one field to the geo_point type.
That field type is the only change I made.
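The manual create looked roughly like this (index, type, and field names are placeholders; the geo_point line is the only change from the auto-created mapping):

    curl -XPUT 'http://localhost:9200/my_index' -d '{
      "mappings": {
        "my_type": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }'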
The column that I wanted to be a geo_point field had a few nulls, so I
selected only rows without nulls, but I still get the same error.
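That is, something along the lines of (field name is a placeholder):

    INSERT OVERWRITE TABLE es_export
    SELECT * FROM source_table
    WHERE location IS NOT NULL;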
Is there any way to identify which column is causing the issue? There are
about 70 columns in my table.
Tl;dr:
- Hive table to Elasticsearch
- Auto-create index works fine
- Fails when I manually create the index with an almost identical mapping
  (one field changed from string to geo_point)
The issue is likely that the "NULL" value in your data is not an actual
null but the literal string "NULL", which Elasticsearch then tries to
parse as a number. You should be able to get around it by converting
"NULL" to a proper null value, which es-hadoop can recognize;
additionally, you can translate it to a default value instead.
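For example, in Hive you could do something like this (a minimal sketch; table and column names are placeholders, and CASE/COALESCE are standard Hive built-ins available on 0.13):

    INSERT OVERWRITE TABLE es_export
    SELECT
      id,
      -- turn the literal string "NULL" into a real SQL null
      CASE WHEN location = 'NULL' THEN NULL ELSE location END AS location,
      -- or substitute a default value instead of null
      COALESCE(CASE WHEN amount = 'NULL' THEN NULL ELSE amount END, '0') AS amount
    FROM source_table;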
As for identifying which field caused the exception, unfortunately
Elasticsearch doesn't provide enough information about this yet, but it
should. Can you please raise a quick issue on es-hadoop about this?