Bulk API "lies" in the response

Hello,

I'm having some problems using the Bulk API because it is lying in the response it returns.

I'm using the following request:

    POST /_bulk?refresh=true
    { "index": { "_index": "my_index", "_id": 123 } }
    { "internal_id": 123, "attr1": "value1", "attr2": "value2", ... }

And I'm getting the response below:

    {
      "took": 16,
      "errors": false,
      "items": [
        {
          "index": {
            "_index": "my_index",
            "_type": "_doc",
            "_id": "123",
            "_version": 1,
            "result": "created",
            "forced_refresh": true,
            "_shards": {
              "total": 1,
              "successful": 1,
              "failed": 0
            },
            "_seq_no": 21744,
            "_primary_term": 3,
            "status": 201
          }
        }
      ]
    }

That looks great, but when I search "my_index" for the document with ID 123, there is no document.

In fact, the number of documents in the index does not change.

The "funny" thing is that this only happens when I launch the request from a Google Cloud Function. If I launch exactly the same request from my local computer, the document is successfully indexed.

Can anyone help me identify the problem? I'm going crazy...

Thanks in advance!

Can you provide a full reproduction of the requests that you are running for the index and the search, along with the responses?

I suspect that at the point of searching, the index has not been refreshed, so the document does not yet appear in search results. If you fetch the document using its ID with the Get API, it'll be there.
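To make the distinction concrete: the Get API reads a document by ID in real time, whereas a search only sees documents once the index has been refreshed. Here is a minimal Python sketch of the Get-by-ID check. The helper names `doc_url` and `doc_exists` are hypothetical, the host is a placeholder, and the HTTP call itself is left to whatever client you use.

```python
from urllib.parse import quote


def doc_url(base_url: str, index: str, doc_id) -> str:
    """Build the Get API URL for one document: GET /<index>/_doc/<id>."""
    index_part = quote(index, safe="")
    id_part = quote(str(doc_id), safe="")
    return f"{base_url.rstrip('/')}/{index_part}/_doc/{id_part}"


def doc_exists(get_response: dict) -> bool:
    """Interpret a parsed Get API response body: "found" is true when the document exists."""
    return bool(get_response.get("found", False))


# Placeholder host, not a real endpoint.
print(doc_url("https://localhost:9200", "my_index", 123))
```

If the Get API reports `"found": false` as well, a missing refresh is not the explanation and the document really is absent.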

Thanks for the quick response.

Here is a real request (I only changed the attribute values and the endpoint, which is in Elastic Cloud, as you can see):

    POST https://my_elastic_cloud_id.gcp.cloud.es.io:9243/_bulk?refresh=true
    {"index": {"_index": "my_production_index", "_id": 4789467775055}}
    {"store": "STORENAME", "my_id": 4789467775055, "product": {"pname": "the name", "pbrand": "the brand", "pphotos": ["https://url.to.photo.1.jpg", "https://url.to.photo.2.jpg"], "pid": "140205970", "pcategory": "the category", "pdesc": "the text of the description", "pvariants": [{"size": "40", "qty": 5}, {"size": "41.5", "qty": 10}], "pcompareatprice": 127, "ptags": ["tag1", "tag2"], "gender": "Female", "age": "Adult", "pprice": 78, "pdiscount": 39, "pdate": "2020-09-16 08:15:48"}}

And this is the actual response to that request:

    {
      "took": 13,
      "errors": false,
      "items": [
        {
          "index": {
            "_index": "my_production_index",
            "_type": "_doc",
            "_id": "4789467775055",
            "_version": 1,
            "result": "created",
            "forced_refresh": true,
            "_shards": {
              "total": 1,
              "successful": 1,
              "failed": 0
            },
            "_seq_no": 813,
            "_primary_term": 1,
            "status": 201
          }
        }
      ]
    }
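The top-level `errors` flag only says whether any item failed; the per-item entries carry the detail. Below is a small Python sketch of reading them, run against a trimmed copy of the response above. The helper names `summarize_bulk_items` and `failed_items` are hypothetical, not part of any client library.

```python
def summarize_bulk_items(bulk_response: dict) -> list:
    """Return one (action, _id, result, status, _seq_no) tuple per bulk item."""
    rows = []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict keyed by the action name
        # (index, create, update or delete).
        for action, body in item.items():
            rows.append((action, body.get("_id"), body.get("result"),
                         body.get("status"), body.get("_seq_no")))
    return rows


def failed_items(bulk_response: dict) -> list:
    """Return the items whose body carries an "error" object."""
    return [item for item in bulk_response.get("items", [])
            if any("error" in body for body in item.values())]


# Trimmed copy of the response shown in this thread.
response = {
    "took": 13,
    "errors": False,
    "items": [
        {"index": {"_index": "my_production_index", "_id": "4789467775055",
                   "_version": 1, "result": "created", "_seq_no": 813,
                   "_primary_term": 1, "status": 201}}
    ],
}
print(summarize_bulk_items(response))
print(failed_items(response))
```

A `result` of `created` rather than `updated` means no live document with that ID existed when the request ran.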

I also thought of the refresh issue that @forloop pointed out, but as you can see in the request, I force a refresh.

Further, I check the number of documents in the index before and after indexing, and it doesn't change (731 both times):

    GET /_cat/indices

The query I'm using to check the document is (launched from the Kibana console):

    GET my_production_index/_search
    {
      "query": {
        "ids": {
          "values": [ 4789467775055 ]
        }
      },
      "size": 1
    }

And the response is:

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 0,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      }
    }

And, as I said in the first post, if I repeat exactly the same request from my laptop or the Kibana console, it works fine and indexes the document. When I launch it from the Google Cloud Function, I get a valid response but the document is not indexed.

Any idea about how I can detect and fix the issue?

Many thanks!


I just ran the exact commands you posted (thanks for that!) and my response was:

    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "my_production_index",
            "_type" : "_doc",
            "_id" : "4789467775055",
            "_score" : 1.0,
            "_source" : {
              "store" : "STORENAME",
              "my_id" : 4789467775055,
              "product" : {
                "pname" : "the name",
                "pbrand" : "the brand",
                "pphotos" : [
                  "https://url.to.photo.1.jpg",
                  "https://url.to.photo.2.jpg"
                ],
                "pid" : "140205970",
                "pcategory" : "the category",
                "pdesc" : "the text of the description",
                "pvariants" : [
                  {
                    "size" : "40",
                    "qty" : 5
                  },
                  {
                    "size" : "41.5",
                    "qty" : 10
                  }
                ],
                "pcompareatprice" : 127,
                "ptags" : [
                  "tag1",
                  "tag2"
                ],
                "gender" : "Female",
                "age" : "Adult",
                "pprice" : 78,
                "pdiscount" : 39,
                "pdate" : "2020-09-16 08:15:48"
              }
            }
          }
        ]
      }
    }

What version are you on?

I'm on 7.9; I just migrated last week.

From my laptop, I get that same result, but not from the server.

Is there any way to debug what is actually happening?

Possibly-silly question, but are you sure you're indexing into the same cluster as the one against which you're subsequently searching? Check GET _cluster/state/nodes from both locations and verify that they're completely identical.
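The comparison suggested above can be scripted rather than done by eye. This is a rough Python sketch, assuming you have already fetched and parsed the `GET _cluster/state/nodes` response from each location; `same_cluster` is a hypothetical helper and the sample values are made up.

```python
def same_cluster(state_a: dict, state_b: dict) -> bool:
    """Compare two parsed `GET _cluster/state/nodes` responses.

    Two responses describe the same cluster when the cluster UUID and the
    set of node IDs (the keys of "nodes") both match. If neither response
    carries a "cluster_uuid", the check falls back to node IDs alone.
    """
    same_uuid = state_a.get("cluster_uuid") == state_b.get("cluster_uuid")
    same_nodes = set(state_a.get("nodes", {})) == set(state_b.get("nodes", {}))
    return same_uuid and same_nodes


# Made-up sample values for illustration only.
from_laptop = {"cluster_name": "prod", "cluster_uuid": "uuid-A",
               "nodes": {"node-1": {}, "node-2": {}}}
from_function = {"cluster_name": "prod", "cluster_uuid": "uuid-B",
                 "nodes": {"node-9": {}}}
print(same_cluster(from_laptop, from_laptop))
print(same_cluster(from_laptop, from_function))
```

A matching `cluster_name` alone proves little, since two separate deployments can share a name; that is why the sketch ignores it.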

Hello,

I finally found the problem.

After the document was indexed, another process was automatically triggered and deleted it. That's why I couldn't find it, even though everything looked fine during indexing.

Sorry to bother you and thanks again for your help!
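For anyone who lands here with the same symptom, one way to spot a delete like this is to compare the index's delete counter before and after the bulk request. The Python sketch below assumes you have captured the parsed response of `GET my_production_index/_stats/indexing` at both points; the helper names and the sample numbers are hypothetical.

```python
def delete_total(stats: dict, index: str) -> int:
    """Read the primary-shard delete counter from a parsed index stats response."""
    return stats["indices"][index]["primaries"]["indexing"]["delete_total"]


def deletes_between(before: dict, after: dict, index: str) -> int:
    """Number of delete operations recorded between two stats snapshots."""
    return delete_total(after, index) - delete_total(before, index)


def make_stats(index: str, deletes: int) -> dict:
    """Build a trimmed, made-up stats response for the example."""
    return {"indices": {index: {"primaries": {"indexing": {"delete_total": deletes}}}}}


# Hypothetical counters, for illustration only.
before = make_stats("my_production_index", 40)
after = make_stats("my_production_index", 41)
print(deletes_between(before, after, "my_production_index"))
```

A non-zero difference while the document count stays flat suggests that something deleted documents in that window, which is a cue to look for other writers on the index.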
