Hello,
I would like to share a problem encountered in our environments: a row is no longer indexed in Elasticsearch as soon as the decimal value contained in its key exceeds 64 (a side effect of base64 encoding in the connector?).
Context: data is replicated from a DB2 z/OS database to Kafka (Confluent) and indexed into Elasticsearch 7.6.2 via a connector (Confluent Kafka Connect).
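For reference, a simplified sketch of how this kind of Elasticsearch sink connector is registered through the Kafka Connect REST API; the hostnames, topic, index and connector names below are placeholders, not the actual values from our environment:

```python
# Illustrative registration of a Confluent Elasticsearch sink connector
# via the Kafka Connect REST API. All names/URLs are placeholders.
import json
import requests

connector_config = {
    "name": "es-sink-example",  # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "db2.example.table",            # hypothetical CDC topic
        "connection.url": "http://elasticsearch:9200",
        "type.name": "_doc",
        "key.ignore": "false",   # document _id derived from the record key
        "schema.ignore": "false",
    },
}

resp = requests.post(
    "http://kafka-connect:8083/connectors",       # placeholder Connect host
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector_config),
)
print(resp.status_code, resp.text)
```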
Data: a row whose key is composed of:
1 CHAR column
1 DECIMAL column
1 CHAR column
Example:
PAUL 1 DUBOIS ADDRESS 1 BOULANGER
PAUL 2 DUBOIS ADDRESS 2 BOUCHER
PAUL 3 DUBOIS ADDRESS 2 PLOMBIER
... / ...
PAUL 65 DUBOIS ADDRESS 65 JOINER
We attempt to insert more than 64 rows with the same key, except for the decimal column.
Description of the problem: from the 64th row onwards, the record can no longer be found in Elasticsearch.
The rows > 64 are nevertheless extracted from the DB2 z/OS database (via change data capture) and correctly written to the Kafka topic (verified), but they are not found in Elasticsearch after indexing via Kafka Connect.
The problem therefore occurs at the indexing step of the chain. The Kafka connector does not report any rejections or warnings.
The Elasticsearch logs are not telling either.
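To confirm that the documents are really absent (and not simply indexed under an unexpected _id), we run this kind of search on the functional key fields; index and field names are placeholders for illustration:

```python
# Search Elasticsearch by the functional key fields instead of GET by _id,
# to see whether the "missing" rows exist under another document id.
# Index, host and field names are placeholders.
import json
import requests

ES_URL = "http://elasticsearch:9200"   # placeholder host
INDEX = "example-index"                # placeholder index name

query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"col1.keyword": "PAUL"}},    # CHAR part of the key
                {"term": {"col2": 65}},                # decimal part > 64
                {"term": {"col3.keyword": "DUBOIS"}},  # CHAR part of the key
            ]
        }
    }
}

resp = requests.post(f"{ES_URL}/{INDEX}/_search",
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(query))
hits = resp.json()["hits"]["hits"]
print("matching documents:", len(hits))
for h in hits:
    print(h["_id"])   # compare against the functional key we expect
```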
The test was repeated several times with different records, but always with col1 and col3 of the key unchanged and col2 incremented (1, 2, 3, 4, etc., up to more than 64).
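To investigate the base64 / decimal-encoding suspicion, the raw key bytes of a few records from the topic can also be dumped and compared before and after the value 64; this is only a sketch, and the topic name and broker address are placeholders:

```python
# Dump the raw (undeserialized) key bytes of records from the CDC topic,
# printing them as hex and base64 to compare how the decimal key column
# is serialized once its value goes past 64. Names are placeholders.
import base64
from kafka import KafkaConsumer   # kafka-python

consumer = KafkaConsumer(
    "db2.example.table",                     # hypothetical CDC topic
    bootstrap_servers="kafka-broker:9092",   # placeholder broker
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,
)

for record in consumer:
    key = record.key or b""
    print(record.offset, key.hex(), base64.b64encode(key).decode())
```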
Note:
The functional key used in the connector definition is the one described above.
However, the internal ID generated by Elasticsearch is still a technical key.
Is this the source of the problem?
If so, how can we work around it?
Any other opinions?
Thanks for your help.