Random non-URI encoding of document _id

Using NEST (v5.0.0) with ES (v5.1.2), I intermittently end up with a document _id that was not URI-encoded when the document was indexed. The document id includes a plus sign (+):

var indexRequest = new IndexRequest<MyDocument>(myDocumentInstance, id: "a+b");
await Client.IndexAsync(indexRequest);

Sometimes the document _id is stored in ES as expected (a+b). Other times it is stored as a b, with a space in place of the plus sign. In the latter case I end up with two documents instead of one. I cannot reliably reproduce the problem: I've written several variations of tests to try to trigger this behavior, but none of them do. I also found an older issue related to URI encoding (ref. https://github.com/elastic/elasticsearch-net/issues/1768).

I can reproduce the same result manually by issuing a request like:

POST /my-index/my-type/a+b
{
}

results in:

{
   "_index": "my-index",
   "_type": "my-type",
   "_id": "a b",
   "_version": 1,
   "result": "created",
   "_shards": {
      "total": 2,
      "successful": 1,
      "failed": 0
   },
   "created": true
}

whereas percent-encoding the plus sign in the URI:

POST /my-index/my-type/a%2bb
{
}

results in:

{
   "_index": "my-index",
   "_type": "my-type",
   "_id": "a+b",
   "_version": 1,
   "result": "created",
   "_shards": {
      "total": 2,
      "successful": 1,
      "failed": 0
   },
   "created": true
}
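
For illustration, the difference between the two manual requests can be reproduced outside of NEST. This is a minimal sketch, assuming the space comes from form-style URL decoding, where a literal + is read as an encoded space; whether NEST or ES actually takes that path in the intermittent case is not established here. The class and method names (PlusSignEncodingSketch, EncodeId, FormDecode) are hypothetical; Uri.EscapeDataString and WebUtility.UrlDecode are standard .NET calls.

```csharp
using System;
using System.Net;

// Hypothetical helper class, for illustration only.
public static class PlusSignEncodingSketch
{
    // Percent-encode an id so it can be used as a single URI path segment.
    // The plus sign is a reserved character and becomes %2B.
    public static string EncodeId(string id) => Uri.EscapeDataString(id);

    // Form-style decoding: a literal '+' is treated as an encoded space.
    public static string FormDecode(string segment) => WebUtility.UrlDecode(segment);

    public static void Main()
    {
        // Encoded form of the id, as in the second manual request.
        Console.WriteLine(EncodeId("a+b"));             // a%2Bb

        // Unencoded id run through form-style decoding: the plus is lost.
        Console.WriteLine(FormDecode("a+b"));           // a b

        // Encoded id survives the same decoding unchanged.
        Console.WriteLine(FormDecode(EncodeId("a+b"))); // a+b
    }
}
```

Under that assumption, an id that reaches the server as a+b is read back as a b, while a%2Bb (or a%2bb, since the hex digits are case-insensitive) round-trips to a+b.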

Has anyone else come across this? I have since switched to underscores and no longer have the problem, but I would really like to know the root cause.
