I have a very large index in ES (millions of entries, hundreds of GB) with
too few shards.
All I want is to create the same index again, but with more, smaller, more
efficient shards.
And I would love it if this could happen quickly (minutes?).
My googling indicates that the elasticsearch-reindex plugin
(https://github.com/karussell/elasticsearch-reindex) is the best approach.
Is that still the case?
Under the covers elasticsearch-reindex uses the recommended scan-and-scroll
technique.
But the doc/code of the elasticsearch-reindex plugin suggests that its
authors think it perhaps should have been implemented as a River.
And I have not been able to find a River that accomplishes this same task.
This seems like a pretty common thing to need to do, so there may be a
very efficient way to accomplish it that I am missing.
It seems the fastest option would be some sort of streaming process that
simply streams the entries of indexOld to indexNew, where indexNew is
created with a new number of shards.
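To make the idea concrete, here is a rough Python sketch of such a streaming copy. It is an illustration under assumptions, not what elasticsearch-reindex actually does: the function names (`new_index_settings`, `hits_to_bulk_body`, `copy_index`) and the `pages` and `send_bulk` parameters are hypothetical, and the HTTP calls themselves (creating indexNew, the scan/scroll requests against indexOld, and the POST to `_bulk`) are left to the caller.

```python
import json


def new_index_settings(number_of_shards):
    """Body for creating the target index with a chosen shard count.

    The shard count is fixed at creation time, which is why a copy is needed.
    """
    return {"settings": {"number_of_shards": number_of_shards}}


def hits_to_bulk_body(hits, target_index):
    """Convert one page of scroll hits into a newline-delimited _bulk body.

    Each hit becomes two lines: an "index" action line, then the document
    source. The original _type and _id are kept so documents are not duplicated.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


def copy_index(pages, send_bulk, target_index):
    """Stream pages of hits into the target index.

    `pages` is any iterable yielding lists of hits (hypothetically, one list
    per scroll request); `send_bulk` is a caller-supplied function that posts
    a bulk body to the cluster. Returns the number of documents forwarded.
    """
    total = 0
    for hits in pages:
        if not hits:
            break  # an empty page means the scroll is exhausted
        send_bulk(hits_to_bulk_body(hits, target_index))
        total += len(hits)
    return total
```

Keeping the transport out of the sketch means the same loop works with any HTTP client, and the page size and number of parallel senders can be tuned separately.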
Is elasticsearch-reindex still the best tool for this job?
(I am on ES 1.0.2.)
Maybe there is no need to reindex? Can't you just index all your new data
into another index and use an alias to search across both indices? You might
want to watch https://www.youtube.com/watch?v=gBOhCNcjC7k
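As a rough illustration of the alias idea, the request body for the `_aliases` endpoint could be built like this in Python. The helper name `alias_actions` and the alias name `indexAll` are made up for the example; indexOld and indexNew are the names used in the question.

```python
import json


def alias_actions(alias, indices):
    """Build an _aliases request body that points one alias at several indices."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias}} for index in indices
        ]
    }


# Hypothetical names: search "indexAll" to query both indices at once.
body = alias_actions("indexAll", ["indexOld", "indexNew"])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as a POST to `/_aliases`; searches against the alias then fan out to both indices, while new documents are written to indexNew only.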