Elasticsearch-Hadoop id mapping can't handle big numbers?

Hi,

I save an RDD to Elasticsearch from Scala like this:

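import org.elasticsearch.spark.rdd.EsSpark

// rdd holds JSON documents as strings; the IncrementalId field is used as the document _id
EsSpark.saveJsonToEs(rdd, index, Map("es.mapping.id" -> "IncrementalId"))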

When IncrementalId is a small number (say, less than 1 billion), the id mapping is fine. But when IncrementalId is much bigger (generated with the Snowflake algorithm), the mapped id comes out wrong, like this:

IncrementalId        |  Mapped Id in Es ( _id )
106640324646928380   | "106640324646928384"
106640324661608450   | "106640324661608448"
106640324677337090   | "106640324677337088"

The mapped id is off by just a few, sometimes bigger and sometimes smaller.

Is it possible that Elasticsearch cannot handle big numbers like this when doing the id mapping?
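
One possible explanation, which nothing in this thread confirms: the three pairs in the table above are exactly what you get when a 64-bit integer is converted to a double-precision float and back. A double has a 53-bit significand, so not every integer above 2^53 (9007199254740992) can be represented, and for values around 10^17 the representable ones are 16 apart. The Scala sketch below only reproduces that arithmetic. IdRoundTrip and viaDouble are hypothetical names made up for illustration, and the sketch makes no claim about where such a conversion would happen (in elasticsearch-hadoop, in the JSON parsing, or in whatever tool displayed the values).

```scala
object IdRoundTrip {
  // Convert a Long to a Double and back; the Long-to-Double step rounds
  // to the nearest value a double can represent.
  def viaDouble(id: Long): Long = id.toDouble.toLong

  // Values copied from the IncrementalId column of the table above.
  val ids: Seq[Long] = Seq(
    106640324646928380L,
    106640324661608450L,
    106640324677337090L
  )

  def main(args: Array[String]): Unit =
    ids.foreach(id => println(s"$id -> ${viaDouble(id)}"))
}
```

Each round-tripped value matches the _id column of the table (…384, …448, …088). If a floating-point conversion really is involved, one thing to try would be writing IncrementalId as a quoted string in the JSON so that it is never parsed as a number; that has not been tested here.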

What did you map it as?

Hi, IncrementalId is a big integer, and I did not specify any other mapping. The "_id" metadata value is a string.

I ran the test on over 10,000 documents, and they all have the same problem. I assume you can reproduce it simply by taking a big integer such as 106640324646928380 and using "es.mapping.id" to map it to the unique _id metadata value.
