_id value is getting changed in Elasticsearch

I have a unique_id value in my index that is also used as the document _id. As my document count increased drastically, I changed the field type from double, because that type was causing truncation of my values. I changed it to "keyword", since many Elastic articles suggest using numeric data types only when you need range searches, aggregations, etc., which I am not performing on the unique_id field. But after that change, I noticed that the whole value is being changed: for example, I send the value "432375692312511746" and in the Elasticsearch document it shows up as "795807590142069243".
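For context on the truncation mentioned above: an IEEE 754 double has a 53-bit significand, so integers larger than 2^53 (about 9×10^15) cannot all be represented exactly, and a double-mapped field rounds them to the nearest representable value. A quick Python sketch of that rounding on the value in question (the exact rounded value shown is specific to this input):

```python
# A double (IEEE 754 binary64) has a 53-bit significand, so integers
# above 2**53 get rounded to the nearest representable double.
original = 432375692312511746
as_double = float(original)      # what a "double" field would effectively hold
print(int(as_double))            # 432375692312511744 -- trailing digits lost
print(original > 2**53)          # True: outside the exact-integer range
```

Note that rounding to the nearest double only perturbs the trailing digits (here by 2); it would not turn 432375692312511746 into a completely different value like 795807590142069243, so that symptom points at a different cause.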

I would like to understand why this is happening, and also which data type to use for such long numbers so the values are neither truncated nor changed.

Waiting on the community for a response.

Hi @gbandasha

Can you please provide the following:

- The version of Elasticsearch you're working with
- The mapping for the document
- A sample input document
- What the document looks like after it has been indexed / when you GET it back

And point out the differences or changes.

Sure, please find the details below:

1. v7.11.2.

2. I am using the legacy mapping template; the field I am facing the issue with is unique_id, and its mapping is attached below in the snippet.

3. Sample JSON data:

{"unique_id" : "432375692312511746","client_name" : "test"}

4. In Kibana Discover, and even in the JSON tab, it shows up as:

{"unique_id" : "795807590142069243","client_name" : "test"}

Let me know if you need anything else

Hmmmm I just ran this on 7.11.1 and 7.12.0 (I didn't have a 7.11.2 handy)
Are you sure you are retrieving the same documents?

DELETE test

PUT /test
{
  "mappings": {
    "properties": {
      "name" : {"type" : "keyword"},
      "unique_id": {
        "type" : "keyword",
        "eager_global_ordinals": false,
        "norms": false,
        "index": true,
        "store": false,
        "index_options": "docs",
        "split_queries_on_whitespace" : false,
        "doc_values": true
      }
    }
  }
}



POST test/_doc
{
  "name" : "stephen",
  "unique_id" : "432375692312511746"
}

POST test/_doc
{
  "name" : "jeffery",
  "unique_id" : "1293847209184720193847"
}


POST test/_doc
{
  "name" : "dude",
  "unique_id" : "09870987356409586734059867"
}

Results: they all look correct to me, including in Kibana Discover.

GET test/_search

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "scmTqXgBZuwJvuVN3t8C",
        "_score" : 1.0,
        "_source" : {
          "name" : "stephen",
          "unique_id" : "432375692312511746"
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "ssmTqXgBZuwJvuVN3t8m",
        "_score" : 1.0,
        "_source" : {
          "name" : "jeffery",
          "unique_id" : "1293847209184720193847"
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "s8mTqXgBZuwJvuVN3t9J",
        "_score" : 1.0,
        "_source" : {
          "name" : "dude",
          "unique_id" : "09870987356409586734059867"
        }
      }
    ]
  }
}

Thanks, @stephenb, for the quick response. You are right: if I add a record through Dev Tools, or even if I try 10-20 records through Logstash, it works, but when I have millions of docs it changes the value.

Elasticsearch shouldn't be changing values for no reason. That is not to say there couldn't be some strange defect, but let's check a few other things first.

Perhaps it is something in your Logstash pipeline when it gets overwhelmed, or a wrong type setting there.
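One way to rule out a type coercion in the pipeline is to force the field to a string in Logstash before output. The filter below is an illustrative sketch, not taken from the actual pipeline in question:

```
filter {
  # Force unique_id to a string so no numeric (float) conversion
  # can round the value on its way to Elasticsearch.
  mutate {
    convert => { "unique_id" => "string" }
  }
}
```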

I am curious about a few of the non-default settings. If you make it just a keyword, without any of the other settings, do you see the same behavior? i.e. just:

DELETE test

PUT /test
{
  "mappings": {
    "properties": {
      "name" : {"type" : "keyword"},
      "unique_id": { "type" : "keyword"}
    }
  }
}

Also, a silly question: how do you know you're lining up the input document with what you're seeing in Elasticsearch? Is there some other unique ID? In other words, how do you know you're comparing the same documents from source to Elasticsearch when you have millions?

@stephenb I will try the default settings and let you know.

Regarding the second question: I am dealing with transactional data, and I have many other fields, like account_number, transaction date/time, and the amount, that help me drill down to the specific record to compare.


Ok, good to know. When you inspect the document in Kafka, is the unique_id still correct?

Is there any processing between Kafka and Elasticsearch?

There is no processing in Kafka, and the value there is correct; I checked the topic data and compared it.

What sits between Kafka and Elasticsearch, and did you try the default mapping?

There is something going on... but if Elasticsearch were randomly changing keyword values, I think we would be getting a lot of reports about it. Let's keep looking.

@stephenb I set up the default mapping and resent the data through Logstash, and now the IDs are not getting changed.

Thanks a lot for your help.


Good to know... interesting. Thanks for letting us know it is working.