Elasticsearch Bulk Insert through Python Client API

Dear Team,
I am trying to bulk insert 3000 records into Elasticsearch using the bulk API. Any document that contains the "version" field is not being inserted.
For example:
[{"name": "XXXX", "description": "XXXX", "_id": "53a8b9a3-7f7c-4dac-bc6f-15c48b3bcbd9", "source": 2, "source_id": "jk4jJXgB_6QR5Hf9NoZA", "c": 1617367373, "u": 1618978346}]
The above record is inserted into Elasticsearch.
{"name": "Git version 2.31.1", "publisher": "The Git Development Community", "path": "XXX", "version": "2.31", "_id": "d0ec7bab-fa59-46a1-93c2-94c0925e416f", "companyRef": {"id": "53a8b9a3-7f7c-4dac-bc6f-15c48b3bcbd9", "name": "XXX"}, "agentRef": {"id": "606711592b2f4b23de4f6291", "name": "XXX"}, "assetRef": {"id": "53a8b9a3-7f7c-4dac-bc6f-15c48b3bcbd9_000C2964CC22", "name": "XXX"}, "u": XXX, "c": XXX}
The above record is not being inserted.
Can someone please help?
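
For reference, this is a minimal sketch of how such a bulk insert typically looks with the official Python client. The connection URL, index name and the `docs` list are placeholders and not taken from the thread.

```python
# Minimal sketch of a bulk insert with the elasticsearch-py helpers.
# Connection URL, index name and document list are placeholders.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

docs = [
    {
        "name": "XXXX",
        "description": "XXXX",
        "_id": "53a8b9a3-7f7c-4dac-bc6f-15c48b3bcbd9",
        "source": 2,
        "c": 1617367373,
        "u": 1618978346,
    },
    # ... roughly 3000 records, some of which also carry a "version" field
]

# Turn each record into a bulk action: "_id" is promoted to the action
# metadata and the remaining fields become the document body.
actions = (
    {"_index": "my-index", "_id": doc.pop("_id"), "_source": doc}
    for doc in docs
)

helpers.bulk(es, actions)
```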

Please share the error messages that are returned from such requests. Thanks!

Hi
Sorry for the delayed response.
Below is the error message.
elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'For input string: ""11.4.7001.0""')

How is that field mapped in your index?

It is mapped in my index as a string.

I somewhat doubt this, as the exception looks like a number format exception. Can you provide the full stack trace (the involved field is not shown in it) and a reproduction that is not bound to a programming language, e.g. via Kibana Dev Tools?

Hi
I am able to insert the data by looping through and inserting the records one by one, and I am also able to insert the data using the es-pandas module.
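
As a side note (not part of the original exchange): the bulk call does not have to abort on the first bad document. With `raise_on_error=False`, `helpers.bulk` returns the per-item errors, which would show exactly which document and field Elasticsearch rejected. A rough sketch with placeholder names:

```python
# Sketch: surface per-document failures instead of raising on the first one.
# "my-index" and the docs list are placeholders.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

docs = [
    {"name": "Git version 2.31.1", "version": "2.31"},
    {"name": "Some other software", "version": "11.4.7001.0"},
]
actions = ({"_index": "my-index", "_source": doc} for doc in docs)

success, errors = helpers.bulk(
    es,
    actions,
    raise_on_error=False,  # collect item-level errors instead of raising
    stats_only=False,      # return the error list, not just counts
)
print(f"{success} documents indexed, {len(errors)} failed")
for err in errors:
    # Each entry contains the failed action and the reason Elasticsearch gave,
    # including the field that triggered the mapping conflict.
    print(err)
```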

If the above exception happens during an index operation, then it is unlikely that this document has been indexed, as you mentioned in the first post.

If the above exception does not occur when you index one by one, I assume that your mapping at some point is dynamic where it should not be. That means the type of a field is determined by the first document that contains that field. If that first value happens to be a string, then you will not be able to index a number into that field afterwards, and vice versa.

Consider defining your mapping upfront; it makes ingestion, and especially querying and data storage, more reliable, because you know the type of the field you are querying.
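
For illustration, a sketch of what an explicit mapping could look like with the Python client. The index name and field choices are assumptions, not taken from the thread; creating the index with such a mapping before the bulk load keeps a field like "version" from being dynamically mapped as a number by whichever document happens to arrive first.

```python
# Sketch: create the index with an explicit mapping before bulk indexing.
# Index name and field types are assumptions; syntax shown is the
# 7.x-style client call (body=...).
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="my-index",
    body={
        "mappings": {
            "properties": {
                "name": {"type": "text"},
                # keyword keeps version strings like "2.31" or "11.4.7001.0"
                # from being coerced into a numeric field type
                "version": {"type": "keyword"},
            }
        }
    },
)
```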

Hope this helps!

