Bulk inserts more documents than given

Elasticsearch is configured in single-node mode. I have ~1 million elements, but after a bulk insert operation I see 10 million elements. I use this Python code:

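from elasticsearch.helpers import bulk

def generate_docs(data):
    for item in data:
        # One bulk action per item, indexed into 'my_index'
        doc = {
            '_index': 'my_index',
            '_source': item
        }
        yield doc

# Pass the generator (called with the data), not the function object
bulk(client, generate_docs(data))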

What is the reason for this duplication, and is it a real problem?

Where do you see 10m?

Are you using a nested field type in your mapping?

I see it when I call _cat/indices. And yes, I use a nested field type.

So that's expected.

Each nested object is indexed as its own Lucene document, and those Lucene documents are what the cat API counts.
If you run a search, you should get the right number.
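
To illustrate the arithmetic, here is a rough sketch in plain Python (no cluster needed). It assumes a single level of nesting, and the helper name `estimate_lucene_docs` and the nested field `comments` are hypothetical, made up for the example. It counts one Lucene document per top-level document plus one per nested object:

```python
from typing import Any, Dict, Iterable, List


def estimate_lucene_docs(docs: Iterable[Dict[str, Any]], nested_fields: List[str]) -> int:
    """Estimate Lucene docs: one per top-level doc plus one per nested object."""
    total = 0
    for doc in docs:
        total += 1  # the top-level (root) document
        for field in nested_fields:
            value = doc.get(field) or []
            # A nested field may hold a single object or a list of objects.
            if isinstance(value, dict):
                value = [value]
            total += len(value)
    return total


# Hypothetical sample: two documents with a nested field called "comments".
sample = [
    {"title": "a", "comments": [{"text": "x"}, {"text": "y"}, {"text": "z"}]},
    {"title": "b", "comments": [{"text": "w"}]},
]

print(len(sample))                                 # top-level documents
print(estimate_lucene_docs(sample, ["comments"]))  # top-level plus nested objects
```

So roughly nine nested objects per document on average would turn 1 million documents into about 10 million in _cat/indices. To check the real number of top-level documents, the count API (for example `client.count(index='my_index')['count']` in the Python client) should be closer to what you inserted.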

