NumberFormatException when sorting by numeric document ID

Benji_Smith · February 13, 2014, 12:03am

Hello hello! I have a bizarre error I've been trying to debug for a few
weeks with no luck, and I'm finally left to conclude that it may be a bug
in Elasticsearch.

Once every few days, I start seeing shard failures in my query results,
like this:

{
"index": "my_index",
"shard": 3,
"status": 500,
"reason": "RemoteTransportException[[HOSTNAME][inet[/10.0.123.123:9300]][search/phase/query]]; nested: QueryPhaseExecutionException[[my_index][3]: query[ConstantScore(cache(_type:my_type))],from[0],size[10],sort[<custom:"id": org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@97c2b4f>]: Query Failed [Failed to execute main query]]; nested: ElasticSearchException[java.lang.NumberFormatException: Invalid shift value (64) in prefixCoded bytes (is encoded value really an INT?)]; nested: UncheckedExecutionException[java.lang.NumberFormatException: Invalid shift value (64) in prefixCoded bytes (is encoded value really an INT?)]; nested: NumberFormatException[Invalid shift value (64) in prefixCoded bytes (is encoded value really an INT?)]; "
}

This query is operating against an index with about 100 different fields
(including several different nested types), but the relevant portion of the
mapping looks like this:

{
  "my_type" : {
    "_id"        : { "type" : "long", "path" : "id" },
    "properties" : {
      "id" : { "type" : "long" },
      /* ... LOTS OF OTHER FIELDS, INCLUDING MANY NESTED TYPES */
    }
  }
}

I've been able to isolate the shard failures to a minimal query of this
form:

{
  "query" : { "match_all" : { } },
  "sort" : [{
    "id" : { "order" : "asc" }
  }]
}

Basically, sorting by (numeric) ID causes shard failures when the shards
sometimes mistakenly think that there are non-numeric values in the "id"
field. I've audited the data, and it conforms with the schema. The id
fields always contain valid LONG values.
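
For anyone who wants to detect this condition from client code, here is a minimal Python sketch. It assumes the search response carries failed shards under `_shards.failures`, with entries shaped like the one shown above; the helper names `build_sort_query` and `shard_failures` are made up for illustration.

```python
def build_sort_query(field: str = "id", order: str = "asc") -> dict:
    """Return the minimal match_all query sorted on one field."""
    return {"query": {"match_all": {}}, "sort": [{field: {"order": order}}]}


def shard_failures(response: dict) -> list:
    """Return (index, shard, innermost reason) for each failed shard.

    Assumption: failures are listed under response["_shards"]["failures"].
    """
    failures = response.get("_shards", {}).get("failures", [])
    summary = []
    for failure in failures:
        reason = failure.get("reason", "")
        # Keep only the innermost cause, i.e. the last "nested:" segment.
        innermost = reason.split("nested:")[-1].strip(" ;")
        summary.append((failure.get("index"), failure.get("shard"), innermost))
    return summary
```

A response can come back with partial results and a non-empty failure list, so checking that list on every request is the only way to notice the problem from the client side.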

Whenever the shard failures occur, I can silence them for a few days by
optimizing the index, like this:

curl -XPOST 'http://HOSTNAME:9200/my_index/_optimize?max_num_segments=1'

And the shard failures will stop for a day or two, but inevitably, within a
few days the failures will return and I'll have to optimize the index
again. The weird thing is that the status URL always reports GREEN status
and all shards healthy, even when these queries are failing on every
request.

I experienced these failures originally on 0.90.5, but I continued seeing
the same problems after recently upgrading to 0.90.10. I even deleted the
index and rebuilt from scratch under 0.90.10, but I've kept seeing the same
failures.

Any idea what might be going on?

Thanks!

benji smith

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3d8da85f-ed52-4127-b283-94998e851713%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

spinscale · February 13, 2014, 9:11am

Hey,

Is there a complete stack trace in the Elasticsearch log files that you
could post here?
Also, does your query span only one index and one type? If not, can you
provide the other mappings? (I don't think this is the issue here, since you
say that optimizing down to one segment makes it work again; I just want to
rule things out.) Is it possible that there is one type in your index that
has a different id mapping, and that type has been deleted again? This
would explain why it works after an optimize...

--Alex

Binh_Ly · February 13, 2014, 12:45pm

Like Alex mentioned, I would check all the mappings to ensure the id field
has the same type in every one of them (the value stored in it doesn't
matter; what matters is the type defined in the mapping). Your error message
means that in one type the id field is mapped as long, and in another type
it is mapped as int, and you are querying across those two types, which
gives this error.
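
The check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the mappings have already been fetched and reduced to the shape `{type_name: {"properties": {...}}}`, it looks at top-level properties only, and `conflicting_fields` and `other_type` are hypothetical names.

```python
from collections import defaultdict


def conflicting_fields(mappings: dict) -> dict:
    """Map each field name to its per-type datatypes when the types disagree.

    Assumed input shape: {type_name: {"properties": {field: {"type": ...}}}}.
    Only top-level properties are inspected.
    """
    seen = defaultdict(dict)  # field name -> {type_name: datatype}
    for type_name, mapping in mappings.items():
        for field, spec in mapping.get("properties", {}).items():
            # Fields without an explicit "type" are treated as objects here.
            seen[field][type_name] = spec.get("type", "object")
    return {
        field: by_type
        for field, by_type in seen.items()
        if len(set(by_type.values())) > 1
    }


mappings = {
    "my_type": {"properties": {"id": {"type": "long"}, "name": {"type": "string"}}},
    "other_type": {"properties": {"id": {"type": "integer"}, "name": {"type": "string"}}},
}
print(conflicting_fields(mappings))
# {'id': {'my_type': 'long', 'other_type': 'integer'}}
```

Any field name that shows up in the result is a candidate for the sort failure, regardless of what values the documents actually hold.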

Benji_Smith · February 13, 2014, 3:48pm

Yes, you guys are right. There are multiple different types in this index,
and some of them have LONG ids, while others have INTEGER or STRING ids.
I'll have to redesign a few parts of the mappings to fix that problem.

I suppose the same restriction applies across nested types as well, right?
I'm heavily using nested types, and many of them have their own inner ID
fields.

Thanks for your help!

benji

Benji_Smith · February 13, 2014, 4:03pm

Are there any ES committers on the mailing list who wouldn't mind
commenting on this issue?

If I have two different types "user" and "session", and both of them have
an "id" field, shouldn't Elasticsearch understand that those are two
different fields, and that their fully-qualified names are actually
"user.id" and "session.id"? Using only fully-qualified names in the Lucene
internals seems like a straightforward way to fix the problem.

Incidentally, it looks like there's a bug-report (submitted two YEARS ago!)
here:

github.com/elastic/elasticsearch

sorting failed when same name under different types with different mapping

opened 04:39PM - 24 Feb 12 UTC

closed 02:25PM - 08 Jul 14 UTC

medcl

if there are same fieldname with different data type in different type, for example, datetime and string:

type_a have field:time, data type is date
type_b have field:time, data type is string (this type may be created dynamic)

if i want to query against type_a use this query:

http://localhost:9200/index/type_a/_search?q=*&sort=time

perhaps, you may get this error (not always, occasional after a restart):

{"error":"SearchPhaseExecutionException[Failed to execute phase [query], total failure; shardFailures {[Pqdw_LAFSbOfyo9yVU9aaw][index][0]: QueryPhaseExecutionException[[index][0]: query[ConstantScore(_:_)],from[0],size[10],sort[<custom:\"time\": org.elasticsearch.index.field.data.strings.StringFieldDataType$1@1693f17f>]: Query Failed [Failed to execute main query]]; nested: IOException[Can't sort on string types with more than one value per doc, or more than one token per field]; }{[Pqdw_LAFSbOfyo9yVU9aaw][index][1]: QueryPhaseExecutionException[[index][1]: query[ConstantScore(_:_)],from[0],size[10],sort[<custom:\"time\": org.elasticsearch.index.field.data.strings.StringFieldDataType$1@3a18c8ca>]: Query Failed [Failed to execute main query]]; nested: IOException[Can't sort on string types with more than one value per doc, or more than one token per field]; }{[Pqdw_LAFSbOfyo9yVU9aaw][xxx][2]: QueryPhaseExecutionException[[index][2]: query[ConstantScore(_:_)],from[0],size[10],sort[<custom:\"time\": org.elasticsearch.index.field.data.strings.StringFieldDataType$1@31266392>]: Query Failed [Failed to execute main query]]; nested: IOException[Can't sort on string types with more than one value per doc, or more than one token per field]; }]","status":500}

this line:

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/search/sort/SortParseElement.java#L159

it returned the wrong FieldMapper, it should be "LongFieldMapper" but actually returned is "StringFieldMapper", didn't load the right data, so sorting failed.

with the given parameter "fieldName", it didn't know which mapping will be used, and elasticsearch should be more smart, to choose the right mapping. in this case, if we have explicit specify the type (in url: http://localhost:9200/index/type_a/_search), es can select the mapping within type: type_a.

If this is the desirable behavior, then why hasn't this bug been closed as
"won't fix"? Or if it's legitimately a bug, why wasn't it fixed before
releasing 1.0? It seems like a pretty fundamental flaw in the system that
the functionality of one type can be broken by the definition of another
essentially unrelated type.

I can understand why the behavior is what it is, historically, but it seems
self-evidently like a bug. In what kind of system would this be the
desirable behavior?

Thanks!

benji

Binh_Ly · February 13, 2014, 4:21pm

This is on the plate. I'm not 100% sure exactly what the fix will be but it
could be something along the lines of a warning when a mapping is
introduced with the same field name but different types.

Benji_Smith · February 13, 2014, 5:18pm

Thanks for your comment! It looks like the correct GitHub issue to reference
is this one:

github.com/elastic/elasticsearch

Field resolution should be unambiguous · Issue #4081

opened 06:36PM - 04 Nov 13 UTC

closed 01:54PM - 10 Dec 14 UTC

clintongormley

>breaking

As far as I understand it, fields are resolved on a _first found_ basis. So given the following documents:

```
PUT /index/foo/1
{ "count": 1, "foo": { "count": 1 } }

PUT /index/bar/2
{ "count": 1, "foo": { "count": 1 } }
```

.... the field `foo.count` could resolve to `foo.count`, `foo.foo.count`, or `bar.foo.count`, depending on which is found first.

Field resolution should be unambiguous. Field names should be grouped by type and sorted in descending order by number of `.`. So the above mappings should result in:

```
bar:
  foo.count
  count
foo:
  foo.count
  count
```

Then if no type is specified (or multiple types are specified), go through the groups in alphabetical order. This would result in the following resolutions:

```
foo.foo.count => (foo) foo.count
foo.count     => (foo) count
count         => (bar) count
*.foo.count   => (bar) foo.count
*.count       => (bar) count
*.*.count     => (bar) foo.count
```
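
The grouping step proposed in that issue (group field names by type, order each group by descending number of dots, walk the types alphabetically) could look roughly like this in Python. It is a sketch of the ordering only, not of the resolution rules themselves, which the issue text does not spell out in full; `group_fields` is a hypothetical name.

```python
from collections import defaultdict


def group_fields(fields):
    """Group (type_name, field_path) pairs by type, most-dotted paths first."""
    groups = defaultdict(list)
    for type_name, path in fields:
        groups[type_name].append(path)
    # Types in alphabetical order; within a type, descending by number of dots.
    return {
        t: sorted(groups[t], key=lambda p: p.count("."), reverse=True)
        for t in sorted(groups)
    }


fields = [("foo", "count"), ("foo", "foo.count"), ("bar", "count"), ("bar", "foo.count")]
print(group_fields(fields))
# {'bar': ['foo.count', 'count'], 'foo': ['foo.count', 'count']}
```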

I've added my comments, and I'm rooting for a solution to this problem
rather than just a warning, which won't really solve the problem for us.
Fingers crossed!

benji

Ivan · February 13, 2014, 11:41pm

I doubt this issue will ever be "fixed" since the limitation exists in
Lucene. All types belong to the same index and a field's data needs to be
uniform in Lucene's eyes. A document's type is used to indicate different
mappings for a document, but not different ways to segment the data types
in the index itself. This scenario should be documented however, so that
others do not fall into the same trap.

--
Ivan

Benji_Smith · February 14, 2014, 4:32pm

This can absolutely be fixed in Elasticsearch. It's not a problem
with Lucene, but with how ES data is mapped onto the Lucene data model.

The problem is that types and fields use local names instead of
fully-qualified names. If fully-qualified names were used, then as far as
Lucene is concerned, there would be a field named "user.id" mapped as a
long, another field named "product.id" mapped as a string, and a nested
field named "user.address.id" mapped as an integer. Under this kind of
system, "user" and "product" can exist in the same index, without even the
possibility that their names and types would clash.
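
To make the naming argument concrete, here is a toy Python model. It is not how Elasticsearch actually builds Lucene field names; it only compares the two naming schemes, and `index_fields` is a made-up helper.

```python
def index_fields(mappings: dict, qualify: bool) -> dict:
    """Flatten {type: {field_path: datatype}} into {index_field_name: datatypes}.

    With qualify=False the field path is used as-is (local name);
    with qualify=True it is prefixed with the type name.
    """
    fields = {}
    for type_name, props in mappings.items():
        for path, datatype in props.items():
            name = f"{type_name}.{path}" if qualify else path
            fields.setdefault(name, set()).add(datatype)
    return fields


mappings = {
    "user": {"id": "long", "address.id": "integer"},
    "product": {"id": "string"},
}

local = index_fields(mappings, qualify=False)
qualified = index_fields(mappings, qualify=True)

# Local names: "id" ends up with two incompatible datatypes.
print({name: types for name, types in local.items() if len(types) > 1})
# Qualified names: every field has exactly one datatype.
print(all(len(types) == 1 for types in qualified.values()))
```

With local names, "id" collects both long and string; with type-qualified names, "user.id", "product.id" and "user.address.id" each keep a single datatype.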

benji
