Copy data from one index into another index

Hi All,

I have two indexes (let's say tweetindex and tweetindex1) with the same mapping
and the same index type. I am copying all shards of the first index into the
second index directly.

To copy the shards, I copy the shard directories like this (for shard 0):
elasticsearch-0.90.0.RC2/data/eshh/nodes/0/indices/tweetindex/0 -->
elasticsearch-0.90.0.RC2/data/eshh/nodes/0/indices/tweetindex1/0

After restarting Elasticsearch I am able to see the data in the second index.

But I do not want to restart Elasticsearch. I have tried to refresh and
optimize the index so that the data becomes visible in the second index, but
no data was visible through the REST client after refreshing and optimizing.

Is it possible to make the data visible after copying it directly into the
other index, without restarting Elasticsearch?

--
Thanks & Regards
Hanish Bansal

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Copying indices directly seems like a huge anti-pattern. A complete no-no.
A top "DO NOT DO THIS". Here's why I think so:
When you copy an index at the file level, you leave behind the metadata that
made the index interesting: the analyzers, transaction log, etc. You also
bypass all of the nice machinery that was put in place expressly to handle
this scenario, i.e. replication, which is the supported way to accomplish this.
(Well, it obviously doesn't do what you want, so you are trying to work
around it.)

Perhaps the Elasticsearch development folks could weigh in on this. But if
they ARE going to support this scenario, a lot needs to be done, as you have
discovered (what about updates that happen during the copy?). Rather than
filing JIRAs saying "support automatic discovery of copied indices", I
think it makes more sense to say: don't do that.

A plugin that reads data from one index and uses the actual APIs to insert it
into a new one makes sense. Didn't Jorg Prante write one?
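
For illustration only, here is a minimal Python sketch of the "use the actual APIs" idea: it turns search hits read from the source index into a request body for the bulk API, which could then be POSTed to `/_bulk`. The function name `hits_to_bulk_body` and the `tweet` type in the usage note are hypothetical, and fetching the hits (for example with a scan/scroll search) is left out.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Build a newline-delimited bulk body that re-indexes `hits` into `target_index`.

    Each hit is expected to look like a search result entry, i.e. a dict
    with "_type", "_id" and "_source" keys (an assumption of this sketch).
    """
    lines = []
    for hit in hits:
        # Action line: index the document under the same type and id.
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        # Source line: the document itself, unchanged.
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

For example, `hits_to_bulk_body([{"_type": "tweet", "_id": "1", "_source": {"text": "hi"}}], "tweetindex1")` yields one action line and one source line addressed to tweetindex1.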

Yes, there is a way. Create an empty index and close it. Then copy the shards
at the file level. Open the index and you are done, without a restart. When
creating the index, look at the source index's metadata (via the API): get its
settings and mapping and use them for the creation. That gives you a
one-to-one copy without downtime.

Hope that helps, works fine for me :wink:
Andrej
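
The sequence Andrej describes could be laid out roughly as follows. This is only a sketch under stated assumptions: `copy_plan` is a hypothetical helper that returns the ordered steps rather than executing them, `settings` and `mappings` are assumed to have been fetched beforehand with `GET /tweetindex/_settings` and `GET /tweetindex/_mapping` and unwrapped by the caller (the response shape differs between versions), and the "COPY" steps stand for the file-level copy of each shard directory, done however you normally copy files.

```python
import json
import posixpath


def copy_plan(source, target, settings, mappings, indices_dir, shards):
    """Return the ordered steps for a file-level index copy without a restart.

    REST steps are (method, path, body) tuples; file steps are
    ("COPY", source_dir, target_dir) tuples.
    """
    create_body = json.dumps({"settings": settings, "mappings": mappings})
    steps = [
        # 1. Create the empty target index with the source's settings and mapping.
        ("PUT", "/" + target, create_body),
        # 2. Close it so the node is not holding the shard files open.
        ("POST", "/" + target + "/_close", None),
    ]
    # 3. Copy every shard directory at the file level.
    for shard in shards:
        src = posixpath.join(indices_dir, source, str(shard))
        dst = posixpath.join(indices_dir, target, str(shard))
        steps.append(("COPY", src, dst))
    # 4. Reopen the index; the copied data should then be picked up.
    steps.append(("POST", "/" + target + "/_open", None))
    return steps
```

With `indices_dir` set to `elasticsearch-0.90.0.RC2/data/eshh/nodes/0/indices` and `shards=[0]`, the COPY step matches the directory pair from the original question. The thread does not say whether writes to the source index have to be paused during the copy, so that is left open here.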

I'm curious what the use case is. Why do you need to perform this copy
operation?

If you need to rename an index without disrupting your application (or
pushing new code, etc.), you can use aliases to atomically swap index
names. If you need to use data from multiple indices, you can easily
search both indices at the same time by listing both of them in the URI.
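
As a rough illustration of both suggestions, the Python sketch below builds the request for an atomic alias swap through the `_aliases` endpoint and the path for a search across several indices. The helper names and the alias name `tweets` in the usage note are hypothetical; sending the requests is left to whatever HTTP client you use.

```python
import json


def alias_swap_request(alias, old_index, new_index):
    """Build a single _aliases request that moves `alias` from old_index to new_index.

    Both actions travel in one request, which is what makes the swap atomic.
    """
    body = {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
    return "POST", "/_aliases", json.dumps(body)


def multi_index_search_path(indices):
    """Build the search path for several indices, comma-separated in the URI."""
    return "/" + ",".join(indices) + "/_search"
```

For example, `alias_swap_request("tweets", "tweetindex", "tweetindex1")` repoints a `tweets` alias at the new index, and `multi_index_search_path(["tweetindex", "tweetindex1"])` gives `/tweetindex,tweetindex1/_search`.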

Thanks all for your valuable responses!

My actual use case for this scenario is reindexing data. I have
indexes with millions of records. An index may have 4 or more shards,
depending on requirements. I do not want to reindex all the data in such an
index because that may be a heavy operation. If some of a shard's data gets
corrupted, I will use the steps below:

  • Create a new index with the same mapping
  • Copy the available shards to the new index
  • Read the corrupted shard's data and reindex it

Thanks Andrej, this solution is working. :slight_smile:

On Wed, Aug 28, 2013 at 10:39 PM, Zachary Tong <zacharyjtong@gmail.com> wrote:

I'm curious what the use case is? Why do you need to perform this copy
operation?

If you need to rename an index without disrupting your application (or
pushing new code, etc), you can use Aliases to atomically swap around index
names. If you need to use data from multiple indices, you can easily
search both indices at the same time by adding both indices in the URI.

--
Thanks & Regards
Hanish Bansal
