How can I get the document metadata after saving a document in Elasticsearch?

I am saving documents to an index in Elasticsearch using the bulk API in a Python project. How can I get back the IDs of the documents that were created by this call?

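from elasticsearch.helpers import bulk

# With the default stats_only=False, bulk() returns (success_count, list_of_errors)
success, failed = bulk(client, actions, refresh=True)

result = {
    "success_count": success,
    "failed_count": len(failed),
    "success_data": actions if success == len(actions) else [],
    "failed_data": failed
}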

Hi @Mertozturkk,

Welcome to the community! Have you checked the response body of the bulk request?

Looking at the bulk API documentation, there is an items collection in the response that includes the affected document ID for each action.

Can you take a look and see if you can extract that from the response?
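For reference, here is a minimal sketch of pulling the IDs out of that items collection. The `extract_ids` helper and the sample response are made up for illustration; with the low-level client the body would come from a `client.bulk(...)` call, and the sample below only mimics the documented shape rather than showing real output.

```python
def extract_ids(bulk_response):
    """Return (op_type, _id, status) for every item in a bulk API response body."""
    rows = []
    for item in bulk_response["items"]:
        # Each item is a one-key dict: {"index": {...}}, {"create": {...}}, etc.
        op_type, info = next(iter(item.items()))
        rows.append((op_type, info.get("_id"), info.get("status")))
    return rows


# Hand-written sample in the documented response shape (not real cluster output).
sample_response = {
    "took": 30,
    "errors": False,
    "items": [
        {"index": {"_index": "test-index", "_id": "1", "status": 201}},
        {"index": {"_index": "test-index", "_id": "2", "status": 201}},
    ],
}

print(extract_ids(sample_response))
```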

Yes, the information in the documentation is exactly what I need. However, when I step through the source code of the Python library, I see that the data I need stays inside the helper, which only returns the number of successful operations. Unless I have missed something, I will solve this by contributing to the source code.

The information I want is in the item variable in the source code below, but I cannot get at it because it is never returned. How should I proceed? I wanted to open a PR, since I could expose the information with a simple method, but I suspect that process would take a while.

    success, failed = 0, 0

    # list of errors to be collected is not stats_only
    errors = []

    # make streaming_bulk yield successful results so we can count them
    kwargs["yield_ok"] = True
    for ok, item in streaming_bulk(
        client, actions, ignore_status=ignore_status, *args, **kwargs  # type: ignore[misc]
    ):
        # go through request-response pairs and detect failures
        if not ok:
            if not stats_only:
                errors.append(item)
            failed += 1
        else:
            success += 1

    return success, failed if stats_only else errors

Elasticsearch version: 8.10.0

elasticsearch-py version: 8.10.0

You can get the IDs as you bulk index documents:

for ok, document in streaming_bulk(client, actions=actions, index="test-index"):
    print(document["index"]["_id"])
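If you also want the counts and failures in one result, something along these lines might work. This is only a sketch: `collect_bulk_results` and the `success_ids` key are names I made up, and the fake pairs stand in for what `streaming_bulk` yields so the snippet runs without a cluster.

```python
def collect_bulk_results(results):
    """Split (ok, item) pairs, as yielded by streaming_bulk, into IDs and failures."""
    created_ids, failed = [], []
    for ok, item in results:
        # item is a one-key dict such as {"index": {"_id": "...", "status": 201}}
        op_type, info = next(iter(item.items()))
        if ok:
            created_ids.append(info["_id"])
        else:
            failed.append(item)
    return {
        "success_count": len(created_ids),
        "failed_count": len(failed),
        "success_ids": created_ids,
        "failed_data": failed,
    }


# Stand-in pairs (hypothetical values) so the sketch runs without Elasticsearch.
fake_results = [
    (True, {"index": {"_id": "a1", "status": 201}}),
    (False, {"index": {"_id": "a2", "status": 400,
                       "error": {"type": "mapper_parsing_exception"}}}),
]

result = collect_bulk_results(fake_results)
print(result["success_ids"])
```

With a real client you would pass `streaming_bulk(client, actions, raise_on_error=False)` as the argument. Setting `raise_on_error=False` matters here, because by default a failed item raises a `BulkIndexError` instead of being yielded with `ok` set to `False`.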

Alternatively, you can specify the IDs you want for the documents yourself so you don't have to retrieve them after the fact. Here's an example of that with a custom iterable function:
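A minimal sketch of such a generator, assuming each document carries a unique field that can serve as its ID. The `generate_actions` name, the `sku` field, and the sample documents are all hypothetical; the `_index`, `_id`, and `_source` keys are the ones the bulk helpers read from each action.

```python
def generate_actions(docs, index_name):
    """Yield one bulk action per document, with an explicit _id."""
    for doc in docs:
        yield {
            "_index": index_name,
            "_id": doc["sku"],  # assumption: "sku" is unique per document
            "_source": doc,
        }


# Hypothetical sample documents.
docs = [
    {"sku": "A-100", "name": "widget"},
    {"sku": "B-200", "name": "gadget"},
]

actions = list(generate_actions(docs, "test-index"))
known_ids = [action["_id"] for action in actions]
print(known_ids)
```

Because the IDs are chosen up front, they are known before the request is sent, and the generator can be passed straight to `bulk` or `streaming_bulk` as the actions argument.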


Thank you. I missed the values that streaming_bulk yields; this will make my job easier.
