Using php client bulk() - ERROR: 400 parse_exception: Failed to derive xcontent

Using the 2.0 PHP client's bulk(), I'm getting "ERROR: 400 parse_exception: Failed to derive xcontent" on some documents. The linked document below consistently hits this exception. My ES instance is running 2.2.0. Before bulk() is called I can see the document structure in PHP, and it looks like the other documents ahead of it in the bulk queue, yet this one hits the exception. What's interesting is that if I limit each bulk() call to a single document, the document array is populated before the call, but after the exception the array passed to bulk() is empty. The PHP client seems to be emptying the array for some reason.

The document, however, is being created. Seems like I can simply ignore the error?

In Sense I can create the document no problem, no error.

I see a lot of posts about this "Failed to derive xcontent" error, but nothing I've found so far seems to apply to my case.

The forum wouldn't allow me to insert the document in question into my post here, which is pretty large, so here's a link: https://www.ezrackbuilder.com/large-document.txt

Wish I could delete a post. There was an error in our PHP script that was causing this issue. No problem with the PHP client or ES in my case.

What was the error? It might help someone in the future :)

Sure. So my issue was this:

We were following the "bulk indexing in batches" example from https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/_indexing_documents.html, which is reproduced below.

However, we stupidly had the "// Send the last batch if it exists" code inside of our foreach loop. So, when our foreach loop hit its max documents before calling bulk(), the documents had just been emptied out by the other call. If you attempt to bulk an empty set of docs, you get the error we were seeing.

$params = ['body' => []];

for ($i = 1; $i <= 1234567; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type' => 'my_type',
            '_id' => $i
        ]
    ];

    $params['body'][] = [
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    ];

    // Every 1000 documents stop and send the bulk request
    if ($i % 1000 == 0) {
        $responses = $client->bulk($params);

        // erase the old bulk request
        $params = ['body' => []];

        // unset the bulk response when you are done to save memory
        unset($responses);
    }
}

// Send the last batch if it exists
if (!empty($params['body'])) {
    $responses = $client->bulk($params);
}
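
For anyone landing here later, here is a minimal sketch of one way to make this mistake harder to repeat: route every send through a single helper that refuses to send an empty body. The names `flushBatch`, `indexInBatches`, and `$send` are hypothetical and not part of the PHP client. `$send` stands in for a call such as `$client->bulk($params)`, so the sketch runs without an Elasticsearch instance.

```php
<?php
// Hypothetical helper, not part of the Elasticsearch PHP client.
// $send stands in for something like: function ($p) use ($client) { return $client->bulk($p); }
function flushBatch(array &$params, callable $send): bool
{
    // Skip empty bodies: per this thread, sending one is what produced the 400.
    if (empty($params['body'])) {
        return false;
    }
    $send($params);
    $params = ['body' => []]; // reset for the next batch
    return true;
}

// Hypothetical wrapper around the batching loop from the docs example above.
// Returns the number of bulk requests actually sent.
function indexInBatches(iterable $docs, callable $send, int $batchSize = 1000): int
{
    $params = ['body' => []];
    $requests = 0;
    $count = 0;

    foreach ($docs as $id => $doc) {
        $params['body'][] = [
            'index' => ['_index' => 'my_index', '_type' => 'my_type', '_id' => $id],
        ];
        $params['body'][] = $doc;
        $count++;

        // Full batch: send it from inside the loop.
        if ($count % $batchSize === 0 && flushBatch($params, $send)) {
            $requests++;
        }
    }

    // Final partial batch: this belongs outside the loop.
    if (flushBatch($params, $send)) {
        $requests++;
    }

    return $requests;
}

// Usage with a stub sender that records how many documents each request carried.
$sent = [];
$recorder = function (array $params) use (&$sent) {
    $sent[] = intdiv(count($params['body']), 2); // two body entries per document
};

$docs = array_fill(1, 2500, ['my_field' => 'my_value']);
$requests = indexInBatches($docs, $recorder, 1000);
```

With the guard living in one place, a misplaced flush inside the loop turns into a no-op instead of a request with an empty body.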