The code was supposed to be the _id and should never be null, but this data comes from an old database and unfortunately some entries were null. Yes, Elasticsearch generates a random _id in that case, but the _id is kind of important here, and I do not think it should be created silently.
Is this an issue or did I do this wrong?
Thanks everyone
For more information, I was using the Python Elasticsearch client.
_id is mandatory in every document. So either you supply one, which clearly has to be non-null, or Elasticsearch creates it (not quite randomly, but leave that aside for now).
The Python client + Elasticsearch seem to have done exactly this, if I understand what you wrote correctly?
What did you want done differently? Or should it do exactly the same, just more noisily?
I don’t know the Python library specifics, but if you do the same via curl, Elasticsearch replies telling you that it created a document and which _id it used.
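Since the reply to an index request reports the _id that was used, the calling code can compare it with the id it meant to send. A minimal sketch of that check, assuming the response is available as a dict with an `_id` key; the helper name and the sample response below are hypothetical and hand-written, not captured from a real cluster:

```python
def id_was_generated(supplied_id, response):
    """Return True when the _id in the response is not one we supplied.

    `response` is assumed to be the dict-like reply to an index request,
    which reports the _id the document was stored under.
    """
    return supplied_id is None or response.get("_id") != supplied_id


# Hand-written, illustrative response; the values are made up.
sample_response = {
    "_index": "my-index",
    "_id": "hypothetical-generated-id",
    "result": "created",
}

if id_was_generated(None, sample_response):
    print("Elasticsearch assigned the _id:", sample_response["_id"])
```

This only makes the behaviour visible on the application side; it does not change what Elasticsearch does.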
Note this is a very typical use case: many applications don’t supply an _id, and there are some downsides to using/managing _id on your own. So I’d personally not characterise a warning as being necessary. But I’m sure some log level, info or trace, could be set to show it.
You can of course code your application to check for a non-null _id and handle those cases differently. But tons of stuff relies on Elasticsearch doing exactly what it seems to have done in your scenario, so I’d say it was all working consistently with its own documentation.
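The application-side check mentioned above could look roughly like this. It is a sketch under assumptions: the records are plain dicts, the legacy key lives in a field called `code` (taken from the question), and the function and index names are hypothetical. The action dicts use the `_index` / `_id` / `_source` keys that the Python client's bulk helpers accept, but nothing here talks to a cluster.

```python
def split_records(records, index_name, id_field="code"):
    """Separate records that have a usable id from those that do not.

    Returns (actions, missing): `actions` are dicts ready for bulk
    indexing, `missing` are the records whose id field is null or blank.
    """
    actions, missing = [], []
    for record in records:
        doc_id = record.get(id_field)
        # Treat None and blank strings as "no id supplied".
        if doc_id is None or str(doc_id).strip() == "":
            missing.append(record)
            continue
        actions.append(
            {"_index": index_name, "_id": str(doc_id), "_source": record}
        )
    return actions, missing


# Made-up sample rows standing in for the old database.
rows = [
    {"code": "A1", "name": "first"},
    {"code": None, "name": "second"},
    {"code": "", "name": "third"},
]
actions, missing = split_records(rows, "my-index")
print(len(actions), "to index,", len(missing), "held back for review")
```

What to do with the held-back records (log them, reject the import, or index them with a generated _id on purpose) is then an explicit decision in your code rather than something that happens silently.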
You specifically asked:
“Is this an issue or did I do this wrong?”
and IMO the answers are no, it’s not an issue, and no, you did not do anything wrong.