Elasticsearch is configured in single-node mode. I have ~1 million elements, but after a bulk insert operation I see 10 million elements. I use this Python code:
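from elasticsearch.helpers import bulk

def generate_docs(data):
    # Yield one bulk action per source item
    for item in data:
        yield {
            '_index': 'my_index',
            '_source': item
        }

# client is an existing Elasticsearch client; data is the iterable of ~1 million items
bulk(client, generate_docs(data))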
What is the reason for this duplication, and is it a real problem?