Reindex with scan-scroll and the bulk API


(krishna singh) #1

I need to reindex my data. I was able to get a scroll_id using the scan and scroll API. Now I need to insert that data into my new index, but I can't figure out how to use the bulk API to do this.

Please give an example.


(David Pilato) #2

Did you read this? https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html


(krishna singh) #3

Thanks! I have read this, but I am not getting the expected result.

PUT /my_index_1
PUT /my_index_1/_mapping/data
{ "data":{"properties":{ "counter":{"type":"string"}}}
}
PUT /my_index_1/data/2
{
"counter":2
}
GET /my_index_1/_mapping
GET /my_index_1/_search

POST /_aliases
{
"actions": [
{ "remove": {
"alias": "my_index_docs",
"index": "my_index_1"
}},
{ "add": {
"alias": "my_index_docs",
"index": "my_index_2"
}}
]
}

PUT /my_index_2

PUT /my_index_2/_mapping/test
{
"test":{"properties": { "counter":{"type":"long"}
}}
}

GET /my_index_2/test/_search
GET /my_index_2/_mapping

This is my code. The problem is that I am not getting the new mapping or the data in my_index_2, my new index. Please help me out.


(David Pilato) #4

I don't see where your example reads data from the first index and then bulk indexes it into the second one.


(krishna singh) #5

How do I bulk index data into the new index? I am new to Elasticsearch. Please help me out.


(krishna singh) #6

Can you give me an example of reading data from the first index and then bulk indexing it into the second one?


(David Pilato) #7

Well. Not really.

I think that it mostly depends on your technology (Java, PHP, Python... whatever).
Almost all clients (except Java) already have a reindex method.

Here is an example of how I'm doing it using Logstash: http://david.pilato.fr/blog/2015/05/20/reindex-elasticsearch-with-logstash/ but I guess you are looking for something else.

You said that you were able to get a scroll_id using the scan and scroll API.

Then on the client side, you have to prepare a bulk request which does what you need and send it.

POST /_bulk
{ "index":  { "_index": "my_index_2", "_type": "test" }}
{ your doc 1 here }
{ "index":  { "_index": "my_index_2", "_type": "test" }}
{ your doc 2 here }
{ "index":  { "_index": "my_index_2", "_type": "test" }}
{ your doc 3 here }
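
If you are doing this from Python, preparing that request could look roughly like the sketch below. It uses only the standard library and just builds the newline-delimited body from the hits of one scroll page; sending it is left to whatever client you use. The function name `build_bulk_body` and the sample hit are made up for illustration, and I am assuming you want to keep the original document ids.

```python
import json


def build_bulk_body(hits, index, doc_type):
    """Turn scroll hits into a newline-delimited _bulk request body.

    `hits` is the list found under response["hits"]["hits"]; each hit is
    expected to carry "_id" and "_source". (Hypothetical helper, not part
    of any client library.)
    """
    lines = []
    for hit in hits:
        # Action line: index the document into the new index, keeping its id.
        action = {"index": {"_index": index, "_type": doc_type, "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself, unchanged.
        lines.append(json.dumps(hit["_source"]))
    # The bulk API requires a newline after the last line as well.
    return ("\n".join(lines) + "\n") if lines else ""


# Hypothetical hit, shaped like the document indexed earlier in this thread.
sample_hits = [{"_id": "2", "_source": {"counter": 2}}]
body = build_bulk_body(sample_hits, "my_index_2", "test")
print(body, end="")
```

The resulting string is what you would POST to /_bulk.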

(krishna singh) #8

Sorry, this is not my problem. Actually, I want to change the mapping of existing data.
Please go through this link. Is it possible to change the mapping of existing data?

PUT /my_index_1
PUT /my_index_1/_mapping/data
{ "data":{"properties":{ "counter":{"type":"string"}}}
}
PUT /my_index_1/data/2
{
"counter":2
}
GET /my_index_1/_mapping
GET /my_index_1/_search

POST /_aliases
{
"actions": [
{ "remove": {
"alias": "my_index_docs",
"index": "my_index_1"
}},
{ "add": {
"alias": "my_index_docs",
"index": "my_index_2"
}}
]
}

PUT /my_index_2

PUT /my_index_2/_mapping/test
{
"test":{"properties": { "counter":{"type":"long"}
}}
}

GET /my_index_2/test/_search
GET /my_index_2/_mapping


(David Pilato) #9

So the blog post I linked fully answers this question.

Also in the link you just pasted, here is what is written:

Then, pull the documents in from your old index, using a scrolled search and index them into the new index using the bulk API. Many of the client APIs provide a reindex() method which will do all of this for you. Once you are done, you can delete the old index.

This is exactly what I meant in my previous answers...
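
To make those steps a bit more concrete, here is a minimal Python sketch of the loop. It does not talk to Elasticsearch itself: `start_scroll`, `next_page` and `send_bulk` are hypothetical callables that you would back with your client or plain HTTP calls (the initial scrolled search, a scroll request for a given scroll id, and a POST to /_bulk). The in-memory fakes at the bottom are there only so the sketch runs on its own.

```python
import json


def reindex(start_scroll, next_page, send_bulk, index, doc_type):
    """Copy every document from a scrolled search into `index`.

    start_scroll() -> parsed response of the initial search (dict).
    next_page(scroll_id) -> parsed response of the next scroll request.
    send_bulk(body) -> sends a newline-delimited body to /_bulk.
    All three are hypothetical callables supplied by the caller.
    Returns the number of documents sent.
    """
    total = 0
    page = start_scroll()
    # With search_type=scan the initial response carries no hits, only the
    # scroll id; a plain scrolled search may already return the first batch.
    hits = page["hits"]["hits"]
    while True:
        if hits:
            lines = []
            for hit in hits:
                action = {"index": {"_index": index, "_type": doc_type,
                                    "_id": hit["_id"]}}
                lines.append(json.dumps(action))
                lines.append(json.dumps(hit["_source"]))
            # A real sender should also check the "errors" flag of the response.
            send_bulk("\n".join(lines) + "\n")
            total += len(hits)
        # Always continue with the most recent scroll id.
        page = next_page(page["_scroll_id"])
        hits = page["hits"]["hits"]
        if not hits:
            # An empty page after scrolling means there is nothing left.
            return total


# --- In-memory fakes, for illustration only ---
pages = [
    {"_scroll_id": "s0", "hits": {"hits": []}},
    {"_scroll_id": "s1", "hits": {"hits": [
        {"_id": "1", "_source": {"counter": 1}},
        {"_id": "2", "_source": {"counter": 2}},
    ]}},
    {"_scroll_id": "s2", "hits": {"hits": [
        {"_id": "3", "_source": {"counter": 3}},
    ]}},
    {"_scroll_id": "s3", "hits": {"hits": []}},
]
remaining = iter(pages[1:])
used_ids = []
sent = []


def fake_next_page(scroll_id):
    used_ids.append(scroll_id)
    return next(remaining)


copied = reindex(lambda: pages[0], fake_next_page, sent.append,
                 "my_index_2", "test")
print(copied, "documents in", len(sent), "bulk requests")
```

Sending one bulk request per scroll page keeps each request at a bounded size, which is the main reason to combine the two APIs this way.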


#10

@krish0608 It is my understanding that if you want to change the mapping of an index, you need to create a new index with the desired mapping and then index your data into the new index. In other words, you cannot change the mapping of an index that already has data in it.


(Hanan) #11

Hi there,

I am a newbie to Elasticsearch and am trying to reindex data from an existing index using the scroll and bulk APIs, but the bulk API does not work for me.

I have tried scroll as described at the URL below, but I am not sure how to index the returned documents into a new index that has the new mappings. Any ideas?
https://www.elastic.co/guide/en/elasticsearch/guide/current/reindex.html

