How to create auto-generated IDs while reindexing?

I am using the reindex API to reindex 5 TB of data. To make the process faster, I would like to use auto-generated IDs for the new index, as mentioned here. How do we do it?

https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html#_disable_swapping

Hi @praveengadugin1 ,

As the name suggests, auto-generated IDs are automatic: you just need to remove the id field that you have and reindex your data, and Elasticsearch will assign an auto-generated one in the "_id" field.

Look into ingest pipelines to remove the id field if you have one.

You can try:

POST my_index/_doc
{"foo": "bar"}

It will return:

{
  "_index" : "my_index",
  "_type" : "_doc",
  "_id" : "X80Qu2sBMcYBL_rB5lGJ",
  "_version" : 1,
  "result" : "created",
  ...
}

The auto-generated ID is "X80Qu2sBMcYBL_rB5lGJ".
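For anyone scripting this from a client, here is a minimal Python sketch of the same idea: a POST to `my_index/_doc` with no ID lets Elasticsearch generate one, while a PUT to `my_index/_doc/<id>` uses the ID you supply. The helper name `build_index_request` and the `http://localhost:9200` address are my own assumptions for illustration; the request is only built here, not sent.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc, doc_id=None):
    """Build (but do not send) an index request.

    Hypothetical helper: with doc_id=None it targets POST <index>/_doc,
    so Elasticsearch assigns the _id; otherwise it targets
    PUT <index>/_doc/<doc_id> with an explicit ID.
    """
    base = base_url.rstrip("/")
    if doc_id is None:
        url, method = f"{base}/{index}/_doc", "POST"
    else:
        url, method = f"{base}/{index}/_doc/{doc_id}", "PUT"
    return urllib.request.Request(
        url=url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


if __name__ == "__main__":
    # Assumed local cluster address; adjust for your environment.
    auto = build_index_request("http://localhost:9200", "my_index", {"foo": "bar"})
    explicit = build_index_request("http://localhost:9200", "my_index", {"foo": "bar"}, doc_id="1")
    print(auto.get_method(), auto.full_url)
    print(explicit.get_method(), explicit.full_url)
    # To actually send a request: urllib.request.urlopen(auto)
```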

Hope it helps.

I am linking my old post on the same topic.
Try:

POST _reindex
{
  "source": {
    "index": ["twitter", "twitter2"]
  },
  "dest": {
    "index": "new_twitter"
  },
  "script": {
    "source": "ctx._id=null",
    "lang": "painless"
  }
}
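If you want to submit the same request from a script rather than the console, a rough Python sketch could look like this. It only builds the body and the HTTP request; the function names and the `http://localhost:9200` address are assumptions of mine, and the `ctx._id = null` script is taken from the request above.

```python
import json
import urllib.request


def build_reindex_body(sources, dest):
    """Build a _reindex body whose script clears ctx._id,
    mirroring the console request above."""
    return {
        "source": {"index": list(sources)},
        "dest": {"index": dest},
        "script": {"source": "ctx._id = null", "lang": "painless"},
    }


def build_reindex_request(base_url, body):
    """Prepare (but do not send) the POST _reindex request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    body = build_reindex_body(["twitter", "twitter2"], "new_twitter")
    # Assumed local cluster address; adjust for your environment.
    req = build_reindex_request("http://localhost:9200", body)
    print(req.get_method(), req.full_url)
    print(json.dumps(body, indent=2))
    # To actually start the reindex: urllib.request.urlopen(req)
```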

Hi,

With terabytes of data, I guess that ingest may be faster than a script, but it is better to run your own benchmark and share the results here, as they may be useful for other people.

For ingest, you can use the remove processor to remove the id field:
https://www.elastic.co/guide/en/elasticsearch/reference/current/remove-processor.html
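As a rough sketch of how that could be wired up, the Python below builds a pipeline definition with a `remove` processor and a reindex body that points `dest.pipeline` at it. The pipeline name `drop-id-field` and the helper names are hypothetical, and I am assuming the field to drop is a source field literally named `id`. Note that this only strips the source field; the document `_id` is metadata and is handled separately, as in the script shown earlier in the thread.

```python
import json

PIPELINE_ID = "drop-id-field"  # hypothetical pipeline name


def build_remove_pipeline(field):
    """Body for PUT _ingest/pipeline/<id>: remove one source field.
    ignore_missing avoids failures on documents that lack the field."""
    return {
        "description": "Drop a source field before indexing",
        "processors": [{"remove": {"field": field, "ignore_missing": True}}],
    }


def build_reindex_with_pipeline(sources, dest, pipeline_id):
    """Body for POST _reindex that routes documents through the pipeline."""
    return {
        "source": {"index": list(sources)},
        "dest": {"index": dest, "pipeline": pipeline_id},
    }


if __name__ == "__main__":
    print("PUT _ingest/pipeline/" + PIPELINE_ID)
    print(json.dumps(build_remove_pipeline("id"), indent=2))
    print("POST _reindex")
    body = build_reindex_with_pipeline(["twitter", "twitter2"], "new_twitter", PIPELINE_ID)
    print(json.dumps(body, indent=2))
```

Printing the two bodies makes it easy to paste them into the console and time this variant against the script-based one.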
