How to deal with some index failures while backing up


(Guillaume boufflers) #1

Hello guys!

I'm trying to back up my logs using the built-in snapshot system.

Here is the command I run:

curl -XPUT "localhost:9200/_snapshot/backup/180714-5?wait_for_completion=true" -d '{
"partial": "true"
}'

It seems that 113 shards are backed up successfully, but 7 still fail
and I can't work out why.

Here is the output:

{
  "snapshot":{
    "snapshot":"180714-5",
    "indices":[
      "kibana-int",
      "logstash-2014.07.08",
      "logstash-2014.07.17",
      "logstash-2014.07.01",
      "logstash-2014.07.02",
      "logstash-2014.07.14",
      "logstash-2014.06.30",
      "logstash-2014.07.03",
      "logstash-2014.07.10",
      "logstash-2014.06.27",
      "logstash-2014.07.11",
      "logstash-2014.07.15",
      "logstash-2014.06.29",
      "logstash-2014.07.12",
      "logstash-2014.07.04",
      "logstash-2014.07.16",
      "test",
      "logstash-2014.07.05",
      "logstash-2014.06.28",
      "logstash-2014.07.06",
      "logstash-2014.07.07",
      "logstash-2014.07.09",
      "logstash-2014.07.18",
      "logstash-2014.07.13"
    ],
    "state":"SUCCESS",
    "start_time":"2014-07-18T12:39:56.976Z",
    "start_time_in_millis":1405687196976,
    "end_time":"2014-07-18T12:40:42.767Z",
    "end_time_in_millis":1405687242767,
    "duration_in_millis":45791,
    "failures":[
      {
        "index":"logstash-2014.07.07",
        "reason":"primary shard is not allocated",
        "shard_id":0,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "index":"logstash-2014.07.07",
        "reason":"primary shard is not allocated",
        "shard_id":3,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "index":"logstash-2014.07.07",
        "reason":"primary shard is not allocated",
        "shard_id":4,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "index":"logstash-2014.07.07",
        "reason":"primary shard is not allocated",
        "shard_id":1,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "index":"logstash-2014.07.07",
        "reason":"primary shard is not allocated",
        "shard_id":2,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "index":"logstash-2014.07.06",
        "reason":"primary shard is not allocated",
        "shard_id":0,
        "status":"INTERNAL_SERVER_ERROR"
      },
      {
        "node_id":"wiNRYfprQXms4s_Y8pBLPw",
        "index":"logstash-2014.07.06",
        "reason":"IndexShardMissingException[[logstash-2014.07.06][4] missing]",
        "shard_id":4,
        "status":"INTERNAL_SERVER_ERROR"
      }
    ],
    "shards":{
      "total":120,
      "failed":7,
      "successful":113
    }
  }
}
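
For anyone triaging a response like this one, here is a minimal Python sketch (not part of the original post) that groups the failed shards by index. The function name failed_shards_by_index is hypothetical, and the sample data is a trimmed copy of the failures listed above, keeping only the two fields the function reads.

```python
from collections import defaultdict


def failed_shards_by_index(response):
    """Map each index name to the sorted shard ids that failed to snapshot."""
    failed = defaultdict(list)
    for failure in response["snapshot"]["failures"]:
        failed[failure["index"]].append(failure["shard_id"])
    return {index: sorted(shards) for index, shards in failed.items()}


# Trimmed copy of the snapshot response: only "index" and "shard_id" are kept.
response = {
    "snapshot": {
        "failures": [
            {"index": "logstash-2014.07.07", "shard_id": 0},
            {"index": "logstash-2014.07.07", "shard_id": 3},
            {"index": "logstash-2014.07.07", "shard_id": 4},
            {"index": "logstash-2014.07.07", "shard_id": 1},
            {"index": "logstash-2014.07.07", "shard_id": 2},
            {"index": "logstash-2014.07.06", "shard_id": 0},
            {"index": "logstash-2014.07.06", "shard_id": 4},
        ]
    }
}

summary = failed_shards_by_index(response)
for index, shards in sorted(summary.items()):
    print(index, shards)
```

Grouped this way, it is easy to see that every shard of logstash-2014.07.07 failed, while logstash-2014.07.06 lost shards 0 and 4 only.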

I wondered why that particular day was affected, so I went to check my
Kibana dashboard.
It seems that I have no logs for that day.

https://lh4.googleusercontent.com/-MDqEBFTxCXs/U8kX1Ksm2pI/AAAAAAAAB68/bj5fVXMCW0o/s1600/Screen+Shot+2014-07-18+at+2.48.53+PM.png

I remember that I had an issue with Logstash that day: the service was
down for the whole day. I don't think that is a coincidence, is it?
Is there a way to avoid this kind of failure?
Thanks for your time and answers.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/156acf88-01b6-4fd5-96ba-744d4ae02185%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(David Pilato) #2

I think you should find an explanation in the nodes' logs.

Anything there?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


(Guillaume boufflers) #3

I don't have any logs left for that day :confused:


(David Pilato) #4

So, it's hard to tell what happened that day.
Disk full? Running out of file descriptors?

Maybe you should DELETE the faulty index for 07/07/2014 (logstash-2014.07.07) for now, as it sounds like none of its primary shards are allocated.

About the index for 06/07/2014 (logstash-2014.07.06), it sounds like you still have some data (1/5).

My 2 cents
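
As a rough illustration of the DELETE suggestion above, here is a minimal Python sketch that builds the HTTP request for removing one index. It is an editorial addition, not something posted in the thread: the helper name delete_index_request is hypothetical, and the host is assumed to be the same localhost:9200 used in the snapshot command. The request is only built, not sent, because deleting an index cannot be undone.

```python
from urllib.request import Request


def delete_index_request(index, host="http://localhost:9200"):
    """Build (but do not send) the HTTP DELETE request for a single index."""
    return Request(f"{host}/{index}", method="DELETE")


request = delete_index_request("logstash-2014.07.07")
print(request.get_method(), request.full_url)

# To actually delete the index, send the request (irreversible):
# from urllib.request import urlopen
# with urlopen(request) as reply:
#     print(reply.read().decode())
```

It is worth checking the printed method and URL before sending the request, since a typo in the index name would target the wrong index.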

