Reindex into another Elasticsearch


(Frederico Ferreira) #1

This is my first e-mail to the list, so I'm sorry if this problem has
already been explained; I couldn't find where.
I'm out of ideas. This is my question:
I had an Elasticsearch cluster up and running with 1 replica, 5 shards,
1 master (data false) and 10 slaves, with one index per day (from
Logstash). We then changed to hourly indexes; 2 weeks later, after a needed
maintenance reboot, Elasticsearch wasn't able to start properly. It started
assigning unassigned shards and a lot of timeouts occurred.

After 5 days of trying to recover, we decided to rebuild the cluster from
scratch, without any old index, configured with 1 master (data false),
10 slaves, and indexes with 1 shard and 2 replicas.
My task now is to reindex those lost indexes. This is my problem:
I have 10 backup files (up to 400 GB each) and I'm looking for ways to
reindex those indexes (little by little).

  • Should I copy those index folders into the new cluster's data folder?
    • I don't need to change to a daily shard; I just need Elasticsearch
      to assign those indexes.
  • Is there any way to differentiate replica folders from shard
    folders?

We're using Elasticsearch 1.4.4, and each Elasticsearch node runs on a
dedicated 8-core machine with 16 GB of RAM.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAM0Xh3hG7BfiTwDgc0cCseTg4dVNFvav6LWvOmHS_-0Q3Ey0Tw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

1 shard per index doesn't make a lot of sense unless you have very small
amounts of data. You'd be better off going back to the default, as you are
solving the wrong problem there.

What are these backup files you mention, and how did you get them out of ES?
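
For a sense of scale, the shard counts implied by the numbers in this thread
can be worked out directly. This is only back-of-the-envelope arithmetic: it
assumes the two weeks of hourly indexes were all still present and ignores the
older daily indexes, whose count isn't given. `shard_copies` is a made-up
helper name.

```python
def shard_copies(indexes: int, primaries: int, replicas: int) -> int:
    """Total shard copies (primaries plus replicas) the cluster must allocate."""
    return indexes * primaries * (1 + replicas)


DAYS = 14  # "after 2 weeks" (assumes nothing was deleted in that period)

# Hourly indexes, 5 shards, 1 replica: the setup that failed to recover.
hourly_old = shard_copies(DAYS * 24, primaries=5, replicas=1)

# Same period with daily indexes, 5 shards, 1 replica: the earlier setup.
daily_old = shard_copies(DAYS, primaries=5, replicas=1)

# Hourly indexes, 1 shard, 2 replicas: the rebuilt cluster's settings.
hourly_new = shard_copies(DAYS * 24, primaries=1, replicas=2)

print(hourly_old, daily_old, hourly_new)  # 3360 140 1008
```

Under these assumptions, the switch from daily to hourly indexes changes the
shard count far more than the shards-per-index setting does.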

(In reply to Frederico Ferreira's message of 27 April 2015 at 21:50.)



(Frederico Ferreira) #3

I'm sorry for the long delay in answering.
Every index is a folder inside the data folder. I simply compressed
those folders and sent them to S3.
But now we have found an "answer":

  • We built another ES cluster (at another DC) and put those
    folders inside its data directory.
  • This is the part I didn't take part in:
    • we had Logstash querying that ES cluster and outputting to our ES
      cluster.
  • That's our answer to what we were looking for.

Regards,
Frederico Ferreira
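
The Logstash step described above amounts to reading documents out of one
cluster and writing them into another. As a rough illustration only (not the
pipeline that was actually used), the sketch below turns one page of search
hits into a `_bulk` request body. The helper name `hits_to_bulk_body`, the
optional `target_index` argument, and the sample document are assumptions made
for the example; the hit fields (`_index`, `_type`, `_id`, `_source`) follow
the standard search response.

```python
import json
from typing import Any, Dict, Iterable, Optional


def hits_to_bulk_body(hits: Iterable[Dict[str, Any]],
                      target_index: Optional[str] = None) -> str:
    """Build a newline-delimited _bulk body that re-indexes the given hits.

    Each hit becomes an action line followed by its _source line.
    If target_index is None, the hit's original index name is kept.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index or hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical page of hits, shaped like "hits.hits" in a search response.
sample_hits = [
    {"_index": "logstash-2015.04.27.21", "_type": "logs", "_id": "1",
     "_source": {"message": "hello"}},
]
body = hits_to_bulk_body(sample_hits)
```

In practice the hits would come from a scan/scroll search against the source
cluster, and each body would be POSTed to the target cluster's `/_bulk`
endpoint page by page, which suits restoring "little by little".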

(In reply to Mark Walkom's message of 2015-04-27 18:28 GMT-03:00.)


--
Please update your bookmarks! We have moved to https://discuss.elastic.co/



(Mark Walkom) #4

You're better off using the snapshot and restore functionality than your
method.

I'm not sure what you are trying to do, though.
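
For reference, the snapshot and restore workflow boils down to three REST
calls: register a repository, take a snapshot, and restore it on the other
cluster (which must be able to read the same repository). The sketch below
only builds those requests and does not send them. The repository name,
snapshot name, location, and index pattern are placeholders, and
`snapshot_restore_requests` is a hypothetical helper.

```python
import json
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[str]]  # (HTTP method, path, JSON body)


def snapshot_restore_requests(repo: str, location: str,
                              snapshot: str, indices: str) -> List[Request]:
    """Return the REST calls for a shared-filesystem snapshot and restore."""
    return [
        # 1. Register an "fs" repository (run on both clusters).
        ("PUT", f"/_snapshot/{repo}",
         json.dumps({"type": "fs", "settings": {"location": location}})),
        # 2. Snapshot the chosen indices (run on the source cluster).
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         json.dumps({"indices": indices})),
        # 3. Restore the same indices (run on the target cluster).
        ("POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         json.dumps({"indices": indices})),
    ]


# Placeholder names, for illustration only.
for method, path, body in snapshot_restore_requests(
        "my_backup", "/mount/backups/my_backup",
        "snapshot_1", "logstash-2015.04.*"):
    print(method, path, body)
```

Note that a snapshot has to be taken from a running cluster that already holds
the indices, so it does not apply directly to raw copies of the data folder.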

(In reply to Frederico Ferreira's message of 15 May 2015 at 09:49.)



(Bitsof Info) #5

Also try this tool for more easily aggregating FS repo snapshots across a
cluster for restoring on a different cluster. I had to make this tool for a
similar scenario; it might help in your situation too:
https://github.com/bitsofinfo/elasticsearch-snapshot-manager

(In reply to Frederico Barnard's message of Thursday, May 14, 2015 at 5:50:33 PM UTC-6.)


