Isolate data on a single node


#1

Hi,

I'm trying to copy a production index onto a single node in a separate cluster. This is so that I can experiment with data types and attempt to improve performance of certain queries.

I've tried a few approaches but run into trouble each time.

1. Restore from snapshot

With a fresh cluster, I register our S3 snapshot repository under _snapshot, wait for the snapshot list to populate, then restore an index. Unfortunately this has no effect. The restore request returns immediately with:
{"snapshot":{"snapshot":"20150603","indices":[],"shards":{"total":0,"failed":0,"successful":0}}}

2. Connect to existing cluster, detach, then rebalance

I thought that by connecting to the main cluster, then detaching, I'd be able to separate this node with a copy of all or some of the data I need. All I get, however, is a bunch of unallocated shards that I can't access. I was hoping ES might rebalance, but no luck.

3. Connect to existing cluster, detach, then restore from snapshot

A combination of the above. Again, no luck. I specify wait_for_completion=true and the request doesn't return, which suggests a restore is in progress, but again nothing happens. No CPU, disk or network activity, no matter how long I leave it.

Is it possible to isolate a snapshotted index on its own node, separate from the original cluster it came from? What's the correct way of doing this?
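
For reference, the restore call from approach 1 can be sketched roughly as below. This is a minimal sketch, not a confirmed fix: the host, the repository name "s3_repo" and the index name "my_index" are placeholders I made up, and only the snapshot name 20150603 comes from the response above. It just builds the URL and JSON body for POST /_snapshot/<repository>/<snapshot>/_restore. An empty "indices":[] in the response may mean that no index in the snapshot matched the request, so it is probably worth comparing the name against GET /_snapshot/<repository>/<snapshot> first. The index_settings override (zero replicas, for a single node) assumes a version that supports it, which I believe is 1.5 or later.

```python
import json


def restore_request(host, repository, snapshot, index, replicas=0):
    """Build the URL and JSON body for restoring one index from a snapshot.

    Nothing is sent over the network; the caller can POST the body to the
    URL with whatever HTTP client they prefer.
    """
    url = "{}/_snapshot/{}/{}/_restore?wait_for_completion=true".format(
        host.rstrip("/"), repository, snapshot
    )
    body = {
        # Name (or comma-separated names) of the indices to restore.
        "indices": index,
        # Leave the new cluster's templates and settings untouched.
        "include_global_state": False,
        # Assumption: a single node cannot hold replicas, so request none.
        "index_settings": {"index.number_of_replicas": replicas},
    }
    return url, json.dumps(body, sort_keys=True)


# Placeholder values; only the snapshot name appears in the thread.
url, body = restore_request(
    "http://localhost:9200", "s3_repo", "20150603", "my_index"
)
print("POST", url)
print(body)
```

The printed URL and body can be passed to curl or any HTTP client.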


(Otis Gospodnetić) #2

Could https://github.com/allegro/elasticsearch-reindex-tool be of use here?

Otis

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


#3

Thanks Otis! That tool's proven to be pretty useful. Unfortunately, one of the types I'm copying over uses custom routing for each document, and the tool seems to trip on that, so this might not work out. It appears there are alternatives I can try though.
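
One alternative for the routed type is to copy documents by hand with scan/scroll plus the bulk API, carrying each document's routing value across. The sketch below only shows the conversion of a single search hit into a bulk "index" action. It rests on assumptions I haven't verified against this cluster: that the routing value shows up on the hit either as a top-level _routing key or under fields._routing (when the search asks for that field), and that the bulk metadata key is _routing, as in 1.x (newer versions use routing instead). The function name and sample hit are hypothetical.

```python
import json


def hit_to_bulk_action(hit, target_index):
    """Turn one search hit into the two-line bulk 'index' action.

    Assumption: the routing value is available on the hit, either as a
    top-level "_routing" key or under "fields" -> "_routing".
    """
    meta = {"_index": target_index, "_type": hit["_type"], "_id": hit["_id"]}

    routing = hit.get("_routing")
    if routing is None:
        routing = hit.get("fields", {}).get("_routing")
    # Some versions return field values wrapped in a list.
    if isinstance(routing, list):
        routing = routing[0] if routing else None
    if routing is not None:
        # Assumption: 1.x-style metadata key; newer versions use "routing".
        meta["_routing"] = routing

    action = json.dumps({"index": meta}, sort_keys=True)
    source = json.dumps(hit["_source"], sort_keys=True)
    return action + "\n" + source + "\n"


# Hypothetical hit, shaped like a search result that requested _routing.
sample_hit = {
    "_index": "prod",
    "_type": "event",
    "_id": "42",
    "fields": {"_routing": "user-7"},
    "_source": {"message": "hello"},
}
print(hit_to_bulk_action(sample_hit, "experiment"), end="")
```

Concatenating these actions for a page of scroll results gives a body that can be sent to the _bulk endpoint of the target node.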

