Portability of Index on disk


#1

Hello, I am a new Elasticsearch user. I have been using Wikipedia index dumps to run some queries, more or less following this tutorial: https://www.elastic.co/blog/loading-wikipedia

Things are working great! I am using a very simple single-shard instance with no replication, which works perfectly fine for my purposes. By playing around with the mapping settings, I have managed to shrink the provided wiki indexes from multiple GB to the few hundred MB that my queries require. I want to share this index with others at work.

The goal is for them to follow the exact same steps in the tutorial WITHOUT the zcat / bulk import step, which takes about 3 hours to run. I figure that since I have already created the appropriate index on disk, just copying over the few hundred MB would make their setup much faster.

Is there some way to achieve this? Thanks in advance for any pointers!


(Gabriel Tessier) #2

One suggestion: to share data on the same local network, we use reindex from remote.
By filtering with a query, your co-workers can take only the data they need.

More about it here:
https://www.elastic.co/guide/en/elasticsearch/reference/master/reindex-upgrade-remote.html
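
For what it's worth, here is a minimal sketch of what such a reindex-from-remote call might look like, using only the Python standard library. The host names, the index name `enwiki`, and the helper names `build_remote_reindex_body` and `post_json` are made up for illustration. It assumes the co-worker runs it against their own local node, and that the machine holding the index has been added to `reindex.remote.whitelist` in that node's elasticsearch.yml. Reindex does not copy mappings or settings, so the destination index would probably need to be created with the trimmed mappings first.

```python
import json
import urllib.request


def build_remote_reindex_body(remote_host, source_index, dest_index, query=None):
    """Build the JSON body for POST /_reindex that pulls documents from a remote cluster."""
    source = {"remote": {"host": remote_host}, "index": source_index}
    if query is not None:
        # Optional filter so that only the matching documents are copied.
        source["query"] = query
    return {"source": source, "dest": {"index": dest_index}}


def post_json(url, body):
    """POST a JSON body to the given URL and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values: the remote host is the machine that already has the index.
    body = build_remote_reindex_body(
        "http://colleague-host:9200",
        "enwiki",
        "enwiki",
        query={"match": {"title": "elasticsearch"}},
    )
    print(json.dumps(body, indent=2))
    # Uncomment to actually send the request to a local node (assumed address):
    # print(post_json("http://localhost:9200/_reindex", body))
```

Leaving out the `query` argument would copy the whole index instead of a filtered subset.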

Hope it helps.


(Makoto Nozawa) #3

I think a snapshot is portable and can be restored by another cluster.

https://www.elastic.co/guide/en/elasticsearch/reference/6.x/modules-snapshots.html
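
As a rough sketch, the snapshot route might look like the following, using a shared filesystem (`fs`) repository. The repository name `wiki_repo`, the snapshot name `snap_1`, the index name `enwiki`, the location path, and the helper names are all assumptions for illustration. The location has to be listed under `path.repo` in elasticsearch.yml on both machines, and the repository directory has to be copied to the other machine between the two phases. The receiving node also needs to run a compatible Elasticsearch version.

```python
import json
import urllib.request


def source_requests(repo, snapshot, index, location):
    """Requests for the machine that already has the index: register repo, take snapshot."""
    return [
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": index, "include_global_state": False}),
    ]


def destination_requests(repo, snapshot, index, location):
    """Requests for the receiving machine: register the copied repo, restore the index."""
    return [
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        ("POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": index, "include_global_state": False}),
    ]


def send(base_url, method, path, body):
    """Send one JSON request to an Elasticsearch node and return the decoded response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical names and path; adjust to the local setup.
    args = ("wiki_repo", "snap_1", "enwiki", "/mnt/es_backups/wiki_repo")
    for method, path, body in source_requests(*args) + destination_requests(*args):
        print(method, path, json.dumps(body))
    # To execute for real, call send("http://localhost:9200", method, path, body)
    # for each source request on the first machine, copy the repository directory
    # across, then do the same for each destination request on the second machine.
```

Since the snapshot is just a directory of files, it could be zipped and handed around, which seems close to what the original question asks for.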

