_id field limitations

Hi,

When using the _bulk API, it's possible to index a document whose _id contains characters that are not valid in a URL, such as '\\' and whitespace.

Later, if one tries to use the GET API and reference the document by that _id, Elasticsearch returns:

GET testindex/testindex/sometext\\some text
No handler found for uri [testindex/testindex/sometext\\some text] and method [GET]

If one tries to PUT a document to an index with such a document _id, you get:
PUT testindex/sometext\\some text

{
  "error": "Content-Type header [text/plain] is not supported",
  "status": 406
}

If one tries to PUT a document with '\\' characters:
PUT textindex10/sometext\\sometext

No handler found for uri [/testindex/sometext//sometext] and method [PUT]

However, using the ids query, the document can be found by such an _id.
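A likely reason the ids query works here is that the _id travels in the JSON request body rather than in the URL path, so no URL escaping is involved. A minimal Python sketch of building such a body follows; the helper name `ids_query_body` is made up for illustration, and the body would be sent to the index's `_search` endpoint.

```python
import json

def ids_query_body(doc_ids):
    """Return a JSON search body that matches documents by _id (hypothetical helper)."""
    # The ids live in the request body, not the URL path.
    # json.dumps takes care of escaping backslashes for JSON.
    return json.dumps({"query": {"ids": {"values": list(doc_ids)}}})

# Raw string: the _id contains two literal backslashes and a space.
body = ids_query_body([r"sometext\\some text"])
print(body)
```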

I could not find any Elastic documentation on _id field limitations.

The PUT testindex/sometext\\some text error message is unrelated.

IMO, it's just because you don't pass the Content-Type header in your curl command.
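For illustration, here is a rough Python sketch of a PUT request that sets the Content-Type header explicitly. The host `localhost:9200`, the document id `1`, and the document body are assumptions, and the request is only built, not sent.

```python
import json
import urllib.request

def build_put(url, document):
    """Build (but do not send) a PUT request with a JSON body (hypothetical helper)."""
    data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        # Without this header, the 406 error quoted above is what comes back.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Assumed local node, simple id, and made-up document for the sketch.
req = build_put("http://localhost:9200/testindex/testindex/1", {"field": "value"})
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it.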

About the _id limitations: yes, there are some.
I "think" that if you URLEncode the id, that should be ok. Unsure though.
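A small Python sketch of what URL-encoding the id could look like, using the standard library's `urllib.parse.quote`. The helper name `encode_id` is made up, and whether Elasticsearch accepts every id encoded this way is not confirmed in this thread.

```python
from urllib.parse import quote, unquote

def encode_id(doc_id):
    """Percent-encode a document _id for use as one URL path segment (hypothetical helper)."""
    # safe="" makes quote() encode "/" as well, so the id stays a single segment.
    return quote(doc_id, safe="")

# Raw string: two literal backslashes and a space, as in the example above.
raw_id = r"sometext\\some text"
encoded = encode_id(raw_id)
print(f"GET testindex/testindex/{encoded}")
```

Decoding with `unquote(encoded)` gives back the original id, so the encoding is reversible.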

FWIW, I believe you should open an issue so we can try to properly reject that case if needed.
