Recovering after shard failure

I had a shard go crazy: it created over 50 thousand files, and Elasticsearch
was complaining about file limits. So on that machine I deleted everything
for that index under the nodes/0/indices/sessions directory and restarted the
instance, hoping that the shards on the other machines wouldn't be lost. After
restarting, the shard on that node stays in the unassigned state, and the only
thing in the logs is:

[2012-06-03 19:52:43,553][DEBUG][gateway.local ] [es-m01a]
[sessions][12]: not allocating, number_of_allocated_shards_found [0],
required_number [1]

I have 20 nodes and use total_shards_per_node = 1 to force even
distribution.

settings: {
  index.refresh_interval: 60
  index.number_of_replicas: 0
  index.number_of_shards: 20
  index.routing.allocation.total_shards_per_node: 1
  index.cache.field.type: soft
  index.version.created: 190499
}

Is the only solution to delete the entire index?
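A quick way to see which shards are stuck unassigned is the cluster health
API with shard-level detail; a minimal sketch, assuming the default HTTP port
on localhost:

# Per-index and per-shard health; the unassigned primary is what keeps the
# index (and cluster) red.
curl -s 'http://localhost:9200/_cluster/health?level=shards&pretty=true'

# The cluster state includes the routing table, showing which node, if any,
# each shard copy is assigned to.
curl -s 'http://localhost:9200/_cluster/state?pretty=true'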

You don't have replicas for the shards, so if you delete the single copy of
a shard, there isn't really another copy for it to recover from.

Regarding the "shard go crazy" and create many files, do you see anything
in the logs?

On Friday, June 8, 2012 8:25:24 AM UTC-4, kimchy wrote:

You don't have replicas for the shards, so if you delete the single copy
of a shard, there isn't really another copy for it to recover from.

Sorry I wasn't clear, I understand the data is gone. My problem is that the
shard refused to go from unassigned to assigned, so I had to delete the whole
index. I'm fine with there being a "hole" in my index from all the data that
was lost; I was just hoping for a way to not lose everything. Is this
possible, or, without replication, if you lose a shard do you really lose the
whole index?

Regarding the "shard go crazy" and create many files, do you see anything
in the logs?

So it just happened again last night. A single index on a single node went
to over 80k files, while the rest of the nodes have around 800 files for
that same index. This time, instead of removing the directory, I increased
the max number of open files to 128000 and restarted the cluster
(restarting just that node didn't work). Eventually it recovered the shard,
and now that node has around 1000 files.

Nothing useful in the logs; the first error is "too many open files". Since
it seems like it might happen every 5 days, is there a particular log I
should turn on to debug, or just all of them?
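For reference, a rough sketch of checking and raising the per-process file
limit (the "elasticsearch" user name below is an assumption; adjust for your
install):

# Find the ES pid (with two nodes per machine there will be two; pick one).
ES_PID=$(pgrep -f elasticsearch | head -1)

# The limit the running process actually has, and how much of it is in use.
grep 'open files' /proc/$ES_PID/limits
lsof -p $ES_PID | wc -l

# Raise it persistently in /etc/security/limits.conf, assuming the node runs
# as user "elasticsearch", then restart the node:
#   elasticsearch  soft  nofile  128000
#   elasticsearch  hard  nofile  128000
# Or just for the current shell before starting the node:
ulimit -n 128000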

These are new machines, with three differences from the previous setup: 4x
the memory, running 2 nodes per machine instead of 1, and I'm now using 2
disks for data - path.data: ["/disk1/data", "/disk2/data"]

On Fri, Jun 8, 2012 at 3:05 PM, Andy Wick andywick@gmail.com wrote:

My problem is that the shard refused to go from unassigned to assigned. [...]
Is this possible, or, without replication, if you lose a shard do you really
lose the whole index?

Yea, currently there isn't an option to force the unassigned shards to be
allocated and be empty. The assumption here is that you might be able at
some point to bring back a node with those shards allocated, so we wait
for it.

So it just happened again last night. A single index on a single node went
to over 80k files, while the rest of the nodes have around 800 files for
that same index. [...]

Can you gist the lsof -p [es_process_id] output when it happens again? Let's
see what it has open.
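A small sketch of capturing that output for a gist and summarizing which
shard the open files belong to (the grep pattern assumes the index is named
sessions):

ES_PID=$(pgrep -f elasticsearch | head -1)
lsof -p $ES_PID > lsof-es.txt        # this is the file to gist

# Rough count of open files per shard of the sessions index.
grep 'indices/sessions/' lsof-es.txt \
  | awk -F'indices/sessions/' '{print $2}' | cut -d/ -f1 \
  | sort | uniq -c | sort -rn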

On Monday, June 11, 2012 10:50:47 AM UTC-4, kimchy wrote:

Yea, currently there isn't an option to force the unassigned shards to be
allocated and be empty. The assumption here is that you might be able at
some point to bring back a node with those shards allocated, so we wait
for it.

Can I put a feature request in? Or is there a minimum file set I can just
back up and restore if this happens?

Can you gist the lsof -p [es_process_id] output when it happens again?
Let's see what it has open.

I caught it before it ran out of open files, at about 40k files. Here are
the first 2000 lines (gist: ElasticSearch many files · GitHub); the rest is
more of the same.

All the new files are on /disk2, so it seems like the issue might be that
I'm using 2 data paths.
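A sketch of how to spot which shard directory is ballooning across the two
data paths (the nodes/<n>/indices/sessions layout is taken from the paths
mentioned earlier; adjust for your cluster name):

for d in /disk1/data /disk2/data; do
  echo "== $d"
  # File count per shard directory of the sessions index, largest first.
  find "$d" -type f -path '*indices/sessions/*' \
    | sed 's|\(indices/sessions/[0-9]*\).*|\1|' \
    | sort | uniq -c | sort -rn | head
done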

I don't think it is related, because I have a fair number of these, but 3
minutes before the last timestamp on /disk1 I had a bulk exception because
part of my data wasn't clean:

[2012-06-12 15:24:04,112][DEBUG][action.bulk ] [moloches-m05a] [sessions][4] failed to bulk item (index) index {[REMOVED]]}]}
org.elasticsearch.index.mapper.MapperParsingException: Failed to parse [ho]
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:325)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:585)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:573)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:441)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:437)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:311)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:156)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.elasticsearch.common.jackson.JsonParseException: Unrecognized character escape (CTRL-CHAR, code 8)
 at [Source: [B@39a06e34; line: 1, column: 228]
    at org.elasticsearch.common.jackson.JsonParser._constructError(JsonParser.java:1433)
    at org.elasticsearch.common.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
    at org.elasticsearch.common.jackson.impl.JsonParserMinimalBase._handleUnrecognizedCharacterEscape(JsonParserMinimalBase.java:496)
    at org.elasticsearch.common.jackson.impl.Utf8StreamParser._decodeEscaped(Utf8StreamParser.java:2540)
    at org.elasticsearch.common.jackson.impl.Utf8StreamParser._finishString2(Utf8StreamParser.java:1949)
    at org.elasticsearch.common.jackson.impl.Utf8StreamParser._finishString(Utf8StreamParser.java:1905)
    at org.elasticsearch.common.jackson.impl.Utf8StreamParser.getText(Utf8StreamParser.java:276)
    at org.elasticsearch.common.xcontent.json.JsonXContentParser.text(JsonXContentParser.java:83)
    at org.elasticsearch.common.xcontent.support.AbstractXContentParser.textOrNull(AbstractXContentParser.java:106)
    at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:242)
    at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:43)
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:312)
    ... 12 more
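That Jackson error means the raw document contained a literal, unescaped
control character (code 8 is a backspace). A hedged sketch of scrubbing
control characters out of a bulk payload before sending it (the file name
and host are assumptions, not part of the original setup):

# Delete ASCII control characters except tab (\011), newline (\012) and
# carriage return (\015); newlines must survive because the bulk API is
# newline-delimited. Then feed the cleaned payload to _bulk.
tr -d '\000-\010\013\014\016-\037' < bulk.json \
  | curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary @-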

On Tue, Jun 12, 2012 at 10:42 PM, Andy Wick andywick@gmail.com wrote:

Can I put a feature request in? Or is there a minimum file set I can just
back up and restore if this happens?

Yea, you can. A quick hack can be to create a similar empty index locally,
copy the relevant missing shard over to a node's data location, and create a
dummy index (just to get relocation happening again).
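A very rough sketch of that kind of shard surgery. Everything here is a
placeholder (index name, cluster name, data paths, shard number), the
affected node should be stopped while files are copied, and whether an empty
shard borrowed from another index is actually accepted depends on the
version, so treat this as the shape of the idea rather than a tested recipe:

# 1. On a scratch node, create an index with the same layout so it writes
#    out empty shard directories.
curl -XPUT 'http://localhost:9200/sessions' -d '{
  "settings" : {
    "index.number_of_shards" : 20,
    "index.number_of_replicas" : 0
  }
}'

# 2. Stop the scratch node, then copy the empty shard directory (shard 12
#    here) into the data directory of the node that should own the missing
#    shard in the real cluster.
cp -a /scratch/data/mycluster/nodes/0/indices/sessions/12 \
      /disk1/data/mycluster/nodes/0/indices/sessions/

# 3. Restart that node and watch whether the shard leaves the unassigned
#    state.
curl -s 'http://localhost:9200/_cluster/health?level=shards&pretty=true'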

All the new files are on /disk2, so it seems like the issue might be that
I'm using 2 data paths.

  • Do you have any special configuration on the sessions index? Custom merge
    policy settings perhaps?
  • Which ES version are you using?
  • Can you try with one data path and see if the problem goes away? (I don't
    think that that's the problem.)

I don't think it is related, because I have a fair number of these, but 3
minutes before the last timestamp on /disk1 I had a bulk exception because
part of my data wasn't clean:

[2012-06-12 15:24:04,112][DEBUG][action.bulk ] [moloches-m05a] [sessions][4] failed to bulk item (index) index {[REMOVED]]}]}
org.elasticsearch.index.mapper.MapperParsingException: Failed to parse [ho]
[...]

It's simply a parser failure. But this message is strange: failed to bulk
item (index) index {[REMOVED]]}]}. Did you remove things there?

  • Do you have any special configuration on the sessions index? Custom
    merge policy settings perhaps?

I don't think so

"sessions" : {
"settings" : {
"index.analysis.filter.url_stop.stopwords.1" : "https",
"index.analysis.filter.url_stop.stopwords.0" : "http",
"index.refresh_interval" : "60",
"index.version.created" : "190499",
"index.number_of_shards" : "20",
"index.analysis.filter.url_stop.type" : "stop",
"index.analysis.analyzer.url_analyzer.filter.1" : "url_stop",
"index.analysis.analyzer.url_analyzer.filter.0" : "stop",
"index.analysis.analyzer.url_analyzer.type" : "custom",
"index.number_of_replicas" : "0",
"index.analysis.analyzer.url_analyzer.tokenizer" : "lowercase",
"index.routing.allocation.total_shards_per_node" : "1",
"index.cache.field.type" : "soft"
}
}
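That block is what the index settings endpoint returns; a minimal way to
pull it, assuming the default port:

curl -s 'http://localhost:9200/sessions/_settings?pretty=true'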

  • Which ES version are you using?

0.19.4

  • Can you try with one data path and see if the problem goes away? (I
    don't think that that's the problem.)

Yes, I'll try that.

It's simply a parser failure. But this message is strange: failed to bulk
item (index) index {[REMOVED]]}]}. Did you remove things there?

Yes, I can't share the actual data; sorry I wasn't more clear.

Thanks,
Andy