Python client scan/scroll intermittent shard failure for BIG_INTEGER (throws IllegalStateException)

geebee · March 22, 2016, 9:35pm

I am using the Python client library to scan/scroll an index with ~12 million documents, do some work on them, and then bulk index them into a new index.
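For context, the workflow described above can be sketched roughly as follows. This is a minimal sketch, not the actual script: it assumes the `elasticsearch` Python package and its `helpers.scan` and `helpers.bulk` functions, and the names `to_bulk_actions`, `reindex`, `transform`, `source_index` and `target_index` are hypothetical placeholders.

```python
def to_bulk_actions(hits, target_index, transform):
    """Turn scan hits into bulk index actions for target_index.

    `transform` is a hypothetical callable standing in for the
    "do some work on them" step; it receives and returns a _source dict.
    """
    for hit in hits:
        action = {
            "_index": target_index,
            "_id": hit["_id"],
            "_source": transform(hit["_source"]),
        }
        # Older Elasticsearch versions carry a mapping type on every hit.
        if "_type" in hit:
            action["_type"] = hit["_type"]
        yield action


def reindex(client, source_index, target_index, transform):
    """Scan source_index and bulk index the transformed docs into target_index."""
    # Imported here so the pure helper above can be used without the package.
    from elasticsearch import helpers

    hits = helpers.scan(
        client,
        index=source_index,
        query={"query": {"match_all": {}}},
        scroll="5m",  # scroll keep-alive; an illustrative value
    )
    # helpers.bulk consumes the generator lazily and sends it in chunks.
    return helpers.bulk(client, to_bulk_actions(hits, target_index, transform))
```

Keeping the hit-to-action conversion separate from the network calls makes it possible to exercise the transformation on a handful of sample hits without a running cluster.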

I run this script nightly as a cron job, and approximately 10% of the time the scan fails with the following error: "ElasticsearchIllegalStateException[No matching token for number_type [BIG_INTEGER]]".

There is no corresponding error in the server logs, and none of the documents being scanned have fields mapped as the "long" type (which I believe corresponds to BIG_INTEGER).

I have investigated the documents that are failing during the scan by ID, and retrieving them by ID directly or by query on the _id field yields no issues.
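One further check that could be run on those documents is sketched below. It rests on an assumption not confirmed anywhere in this thread: that BIG_INTEGER refers to the JSON parser's number type for integers too large for a signed 64-bit long, rather than to the "long" mapping type. The helper name `find_oversized_ints` is hypothetical, and the code targets Python 3.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def find_oversized_ints(value, path=""):
    """Yield (path, value) for each integer outside the signed 64-bit range.

    Walks nested dicts and lists, such as a document's _source.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python; it is never a numeric field here.
        return
    if isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            yield from find_oversized_ints(child, child_path)
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from find_oversized_ints(child, f"{path}[{i}]")
```

Running this over the `_source` of each failing document would show whether any stored number exceeds the 64-bit range. Python parses such values as arbitrary-precision integers, so nothing would fail on the client side even if one were present.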

I have also played around with the scroll time and fetch size parameters to no avail.

I am completely out of ideas and would be very grateful if anyone had any insight, or could point me in the right direction. Thanks!
