We are currently running Elasticsearch 5.5 on the AWS Elasticsearch service and want to move to elastic.co.
The data was around 28 GB. We deleted around 6,600K docs using delete by query,
but the size stayed the same at 28 GB. So there is no point in taking a backup and restoring it into a new index on elastic.co, since the backup would bring the junk documents along with it.
So we are planning to do a reindex instead.
Can I run _reindex separately, type by type, like below?
POST _reindex
{
  "source": {
    "index": "oldindex",
    "type": "oldtype1"
  },
  "dest": {
    "index": "new_index"
  }
}
and then
POST _reindex
{
  "source": {
    "index": "oldindex",
    "type": "oldtype2"
  },
  "dest": {
    "index": "new_index"
  }
}
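For comparison, the 5.x reindex API also accepts a list for "type" in the source, so the two requests above could probably be combined into one. This is only a sketch using the same index and type names as in the question; it has not been run against this cluster.

```
POST _reindex
{
  "source": {
    "index": "oldindex",
    "type": ["oldtype1", "oldtype2"]
  },
  "dest": {
    "index": "new_index"
  }
}
```

Because "dest" does not set a type, each document should keep the type it had in the source index.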
I read that force merge is not recommended on an active node. In fact, what I read somewhere is that we should let Elasticsearch remove the deleted documents naturally. That is why I thought of reindexing.
It depends on what you mean by an active node. If you create new indices each month or day, and the old indices are no longer updated or added to, then force merging them is fine. Note that it can take a long time.
Alternatively, if your indices are still being written to, deleted documents are removed automatically by Elasticsearch as segments merge while new documents are indexed over time (force merging is not recommended for these indices).
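As a rough sketch of the force merge option for an index that is no longer being written to: the requests below first show how many deleted documents the index still holds, then ask Elasticsearch to merge only the segments that contain deletes. The index name "oldindex" is taken from the question; whether this is appropriate depends on the index really being idle.

```
# Check live docs, deleted docs and on-disk size
GET _cat/indices/oldindex?v&h=index,docs.count,docs.deleted,store.size

# Merge away segments that contain deleted documents
POST oldindex/_forcemerge?only_expunge_deletes=true
```

Running the _cat request again afterwards should show whether docs.deleted and store.size actually dropped.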
One more question: can my application keep running, with the source Elasticsearch domain in use, while the reindex is going on? Will requests that come in during the reindex affect the reindexing process or the source data?
Yes, you can continue using the existing index while reindexing it into another index. According to the reindexing documentation, it takes a snapshot of the existing index when you kick off the reindex. That means changes made after you start the reindex will not be brought across.
What you can do is pull all of the current documents across first. Then, when you are ready to migrate, run another quick reindex with a query that targets only the changes made since the first run.
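A minimal sketch of that second, catch-up reindex is below. It assumes every document carries a last-modified timestamp; the field name "updated_at" and the cutoff value are hypothetical and would need to match your own mapping and the time the first run started.

```
POST _reindex
{
  "source": {
    "index": "oldindex",
    "query": {
      "range": {
        "updated_at": {
          "gte": "2017-08-01T00:00:00Z"
        }
      }
    }
  },
  "dest": {
    "index": "new_index"
  }
}
```

With the default version handling, documents that already exist in new_index under the same type and ID are overwritten by the newer copy. Documents deleted from the source after the first run are not removed by this and would have to be handled separately.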