Searching with _id throws NullPointerException

Hi,
As you know, when I leave id generation to elasticsearch it provides some
unique strings to my entries like "FRS94Gd3SL-DCjYE9ld-7Q",
"-k5EX5ENRMSOQZSYj-xXRw", etc. Searching with id with the following query is
ok:

_id:FRS94Gd3SL-DCjYE9ld-7Q

But when an id starts with a minus sign elasticsearch throws a null pointer
exception:

_id:-k5EX5ENRMSOQZSYj-xXRw

Here is the exception:

{
  "error": "SearchPhaseExecutionException[Failed to execute phase [query], total failure; shardFailures {null: ArrayIndexOutOfBoundsException[57]}{null: ArrayIndexOutOfBoundsException[57]}{null: ArrayIndexOutOfBoundsException[57]}{null: ArrayIndexOutOfBoundsException[57]}{null: ArrayIndexOutOfBoundsException[57]}]"
}

Thanks in advance,

Sezgin Kucukkaraaslan
www.ifountain.com

Hiya

But when an id starts with a minus sign elasticsearch throws a null
pointer exception:

_id:-k5EX5ENRMSOQZSYj-xXRw

The exception itself should be caught and transformed into a better
error message (please open an issue for this:
Issues · elastic/elasticsearch · GitHub )

However, the reason for the error is this: a leading '-' is a special
character in the lucene query parser
http://lucene.apache.org/java/3_0_0/queryparsersyntax.html

It means "exclude results containing this term", so effectively a search
for '_id:-k5EX5ENRMSOQZSYj-xXRw' means:

"give me not k5EX5ENRMSOQZSYj-xXRw"

but without specifying any other clause that SHOULD match, so ES doesn't
know what to exclude k5EX5ENRMSOQZSYj-xXRw from.

You can get around that by escaping the '-', eg:

    curl 'http://localhost:9200/_search?q=_id:\-k5EX5ENRMSOQZSYj-xXRw'
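
If the ids come from code rather than being typed by hand, the escaping can be automated. Below is a minimal Python sketch, not an official client API: the names `escape_lucene_term` and `id_query_string` are made up for illustration, and the character list is taken from the "Escaping Special Characters" section of the Lucene query syntax page linked above. Note that it also escapes the hyphen in the middle of the id, which should be harmless since the parser accepts a backslash before any of these characters.

```python
from urllib.parse import urlencode

# Characters the Lucene query parser treats as operators (assumed list, based
# on the Lucene query syntax docs). '&' and '|' are escaped individually so
# that '&&' and '||' are covered as well.
LUCENE_SPECIAL_CHARS = set('+-!(){}[]^"~*?:\\&|')


def escape_lucene_term(term: str) -> str:
    """Prefix every special character in `term` with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch for ch in term)


def id_query_string(doc_id: str) -> str:
    """Build the URL-encoded query string for a `_id:<escaped id>` search."""
    return urlencode({"q": "_id:" + escape_lucene_term(doc_id)})


if __name__ == "__main__":
    doc_id = "-k5EX5ENRMSOQZSYj-xXRw"
    print(escape_lucene_term(doc_id))
    print("http://localhost:9200/_search?" + id_query_string(doc_id))
```

The second function also percent-encodes the backslash and colon, which is what you would want when building the URL outside of a shell.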

Alternatively, you could just request that doc:

    curl 'http://localhost:9200/myindex/mytype/-k5EX5ENRMSOQZSYj-xXRw' 

Or, you could use a term query:

    curl -XGET 'http://localhost:9200/_all/_search'  -d '
    {
       "query" : {
          "term" : {
             "_id" : "-k5EX5ENRMSOQZSYj-xXRw"
          }
       }
    }
    '
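
The term query is not run through the query parser, so the id goes in as-is with no escaping. A small Python sketch of building that request body (the helper name `term_query_body` is hypothetical, and actually sending the request is left out):

```python
import json


def term_query_body(doc_id: str) -> str:
    """Return a JSON search body that matches `_id` exactly via a term query."""
    body = {"query": {"term": {"_id": doc_id}}}
    return json.dumps(body)


if __name__ == "__main__":
    # The leading '-' needs no special treatment here.
    print(term_query_body("-k5EX5ENRMSOQZSYj-xXRw"))
```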

clint

Hey,

Yeah, that failure is ugly. I fixed it so that it reports the proper Lucene
query parsing failure (which is waaay too detailed ;) ).

The id generation in elasticsearch has changed: it is still a 128-bit UUID,
but it is now encoded using base64 (modified to be URL friendly). I thought
for a long time about which encoding to use in place of the previous
representation (which was way too verbose and too long), and concluded that
base64 (modified to be URL friendly) is the best choice. I'm more than open
to other ideas if someone has them (for a UUID-based representation; I know
that unique ids can be generated in other ways).
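
To illustrate why some ids start with a minus sign, here is a rough Python sketch of the general scheme described above (a 128-bit UUID encoded as URL-friendly base64). It is an assumption-laden illustration, not the actual elasticsearch implementation; `url_safe_id` is a made-up name. In the URL-safe alphabet, index 62 is '-' and index 63 is '_', so any UUID whose top six bits are 111110 produces an id with a leading '-'.

```python
import base64
import uuid


def url_safe_id(u: uuid.UUID) -> str:
    """Encode a UUID's 16 bytes as unpadded URL-safe base64 (22 characters)."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    # A random id, same length as the ones in the original question.
    print(url_safe_id(uuid.uuid4()))
    # First byte 0xF8 = 11111000: the top six bits select alphabet index 62, '-'.
    print(url_safe_id(uuid.UUID(bytes=b"\xf8" + b"\x00" * 15)))
```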

-shay.banon
