Hive to elasticsearch Parsing exception

P_lva · March 12, 2015, 8:12pm

Hello Everyone,

I'm loading data from a Hive table (0.13) into Elasticsearch (1.4.4).
With the auto-create index option turned on, I don't face any problems and
I can see all the data in ES.

However, I get the following error when I create the index manually.

Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [Bad Request(400) - [MapperParsingException[failed to parse]; nested: NumberFormatException[For input string: "NULL"]; ]]; Bailing out..
    at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:199)
    at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:165)
    at org.elasticsearch.hadoop.rest.RestRepository.sendBatch(RestRepository.java:170)
    at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:152)
    at org.elasticsearch.hadoop.rest.RestRepository.writeProcessedToIndex(RestRepository.java:146)
    at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:63)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:621)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:262)

To create the index manually, I used the same mappings from the first
auto-create step and changed one field to the geo_point type.
That field type is the only change I made.

The column that I wanted to be a geo field had a few nulls, so I selected
only the rows without nulls, and I still get the same error.

Is there any way to identify which column is causing the issue? There are
about 70 columns in my table.

TL;DR:
Hive table to Elasticsearch.
Auto-create index works fine.
Fails when I manually create the index with almost the same mapping (except
one field changed from string to geo_point).

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAO9TxdO22hy2%3Dcz1S_DJgvtd0rsw%2Bu0WL8SqLFR8GTbbGJr9EQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

costin · March 13, 2015, 4:08am

The issue is likely caused by the fact that, with your manual mapping, the
"NULL" value is not treated as a null but as a string value.
You should be able to get around it by converting "NULL" to a proper null
value, which es-hadoop can recognize; additionally, you can 'translate' it
to a default value.
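
As a rough illustration of that conversion (a minimal sketch in Python rather than Hive, assuming each row is available as a dictionary before it is sent to Elasticsearch; `normalize_nulls` and the column names are hypothetical and not part of es-hadoop):

```python
from typing import Any, Dict, Optional


def normalize_nulls(row: Dict[str, Any],
                    defaults: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Replace the literal string "NULL" (any case) with a real null,
    or with a per-column default when one is supplied."""
    defaults = defaults or {}
    cleaned = {}
    for column, value in row.items():
        if isinstance(value, str) and value.strip().upper() == "NULL":
            # None becomes a JSON null; a configured default replaces it instead.
            cleaned[column] = defaults.get(column)
        else:
            cleaned[column] = value
    return cleaned


# Hypothetical sample row; the column names are made up for illustration.
row = {"id": 1, "price": "NULL", "location": "41.12,-71.34"}
print(normalize_nulls(row))
print(normalize_nulls(row, defaults={"price": 0.0}))
```

The same idea can be expressed on the Hive side, for example with a CASE expression in the SELECT that feeds the Elasticsearch table.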

As for understanding which field caused the exception, unfortunately
Elasticsearch doesn't provide enough information about this yet, but it
should. Can you please raise a quick issue on es-hadoop about this?
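
Until the error message names the field, one workaround is to check a sample of the data against the mapping on the client side. The following is only a sketch under stated assumptions: rows are dictionaries, `mapping` is a hand-written column-to-type dictionary (not something es-hadoop provides), geo points are stored as "lat,lon" strings, and Python's `float()` is used as an approximation of the number parsing Elasticsearch performs. `find_unparsable_columns` is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Dict, Iterable

NUMERIC_TYPES = {"byte", "short", "integer", "long", "float", "double"}


def _parses(value: Any, es_type: str) -> bool:
    """Return True if the value looks parsable for the given mapped type."""
    try:
        if es_type == "geo_point":
            # Assumption: geo points are "lat,lon" strings.
            lat, lon = str(value).split(",")
            float(lat)
            float(lon)
        else:
            float(value)
    except (TypeError, ValueError):
        return False
    return True


def find_unparsable_columns(rows: Iterable[Dict[str, Any]],
                            mapping: Dict[str, str]) -> Counter:
    """Count, per column, the non-null values that do not parse
    for a column mapped as numeric or geo_point."""
    bad = Counter()
    for row in rows:
        for column, es_type in mapping.items():
            if es_type not in NUMERIC_TYPES and es_type != "geo_point":
                continue  # strings and other types are not checked here
            value = row.get(column)
            if value is not None and not _parses(value, es_type):
                bad[column] += 1
    return bad


# Hypothetical sample data and mapping, for illustration only.
rows = [
    {"id": "1", "price": "9.5", "location": "41.12,-71.34"},
    {"id": "2", "price": "NULL", "location": "NULL"},
]
mapping = {"id": "long", "price": "double", "location": "geo_point", "name": "string"}
print(find_unparsable_columns(rows, mapping))
```

With about 70 columns, running a check like this over a few thousand rows should narrow the problem down to the columns that still carry "NULL" strings.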

Thanks,

P_lva · March 13, 2015, 2:51pm

Ignoring both null values and "NULL" strings worked.

I will open an issue about this.

Thanks a lot, Costin.
