[QA] Infinite loop when catching failure due to file path error


(Pascal P. Pochet) #1

When using the Java API to start an embedded server,
we accidentally discovered the following behavior:

If, for any reason, there is an error in the path defined by

gateway.fs.location

your application will of course fail because the path is inaccessible, but it will also get stuck in an infinite loop, logging tons of:

Error injecting constructor, java.io.IOException: Failed to obtain node lock
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:47)
while locating org.elasticsearch.env.NodeEnvironment
for parameter 6 at org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData.<init>(TransportNodesListShardStoreMetaData.java:72)
while locating org.elasticsearch.indices.store.TransportNodesListShardStoreMetaData
for parameter 2 at org.elasticsearch.gateway.blobstore.BlobReuseExistingNodeAllocation.<init>(BlobReuseExistingNodeAllocation.java:68)
while locating org.elasticsearch.gateway.blobstore.BlobReuseExistingNodeAllocation
while locating org.elasticsearch.cluster.routing.allocation.NodeAllocation annotated with @org.elasticsearch.common.inject.multibindings.Element(setName=,uniqueId=9)
at org.elasticsearch.cluster.routing.allocation.ShardAllocationModule.configure(ShardAllocationModule.java:46)
while locating java.util.Set<org.elasticsearch.cluster.routing.allocation.NodeAllocation>
for parameter 1 at org.elasticsearch.cluster.routing.allocation.NodeAllocations.<init>(NodeAllocations.java:52)
while locating org.elasticsearch.cluster.routing.allocation.NodeAllocations
for parameter 1 at org.elasticsearch.cluster.routing.allocation.ShardsAllocation.<init>(ShardsAllocation.java:53)
while locating org.elasticsearch.cluster.routing.allocation.ShardsAllocation
for parameter 3 at org.elasticsearch.cluster.action.shard.ShardStateAction.<init>(ShardStateAction.java:64)
while locating org.elasticsearch.cluster.action.shard.ShardStateAction
for parameter 5 at org.elasticsearch.action.deletebyquery.TransportShardDeleteByQueryAction.<init>(TransportShardDeleteByQueryAction.java:44)
while locating org.elasticsearch.action.deletebyquery.TransportShardDeleteByQueryAction
for parameter 4 at org.elasticsearch.action.deletebyquery.TransportIndexDeleteByQueryAction.<init>(TransportIndexDeleteByQueryAction.java:41)
while locating org.elasticsearch.action.deletebyquery.TransportIndexDeleteByQueryAction
for parameter 4 at org.elasticsearch.action.deletebyquery.TransportDeleteByQueryAction.<init>(TransportDeleteByQueryAction.java:41)
while locating org.elasticsearch.action.deletebyquery.TransportDeleteByQueryAction
for parameter 5 at org.elasticsearch.action.admin.indices.mapping.delete.TransportDeleteMappingAction.<init>(TransportDeleteMappingAction.java:62)
while locating org.elasticsearch.action.admin.indices.mapping.delete.TransportDeleteMappingAction
for parameter 5 at org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction.<init>(TransportDeleteIndexAction.java:57)
while locating org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction
for parameter 4 at org.elasticsearch.client.node.NodeIndicesAdminClient.<init>(NodeIndicesAdminClient.java:129)
while locating org.elasticsearch.client.node.NodeIndicesAdminClient
for parameter 2 at org.elasticsearch.client.node.NodeAdminClient.<init>(NodeAdminClient.java:39)

This is not catastrophic in itself, since the file path error has to be fixed anyway to get the app working,
but it is probably worth checking the logic that leads to this kind of infinite "catch/try again" loop.
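
For anyone embedding a node, one way to sidestep the log flood is to check the directory before startup. A minimal sketch, assuming the value of gateway.fs.location is a local directory that must already exist and be writable (that is an assumption; the class and method names below are hypothetical, and nothing here calls the Elasticsearch API):

```java
import java.io.File;

final class GatewayPathCheck {

    /**
     * Returns null when the directory looks usable, otherwise a short
     * description of the problem. Hypothetical helper, not part of Elasticsearch.
     */
    static String describeProblem(File location) {
        if (!location.exists()) {
            return "does not exist: " + location;
        }
        if (!location.isDirectory()) {
            return "is not a directory: " + location;
        }
        if (!location.canWrite()) {
            return "is not writable: " + location;
        }
        return null;
    }

    public static void main(String[] args) {
        // The default path is a placeholder; pass the real value as the first argument.
        File location = new File(args.length > 0 ? args[0] : "/path/to/gateway");
        String problem = describeProblem(location);
        if (problem != null) {
            System.err.println("gateway.fs.location " + problem);
            System.exit(1);
        }
        System.out.println("gateway.fs.location looks usable: " + location);
        // Start the embedded node here.
    }
}
```

Failing fast with one message like this keeps the real cause visible instead of burying it under repeated injection errors.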


(Shay Banon) #2

There isn't really an infinite loop; it just tries 50 times to obtain a lock. The user experience is annoying, though, and I need to think about how to improve it...
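
To illustrate the kind of bounded retry described here, a minimal sketch in plain Java. This is not the NodeEnvironment code; BoundedRetry, Attempt and the summary message are hypothetical names, and the limit of 50 simply mirrors the number mentioned above. It tries an action a fixed number of times and then fails once with a single summary error instead of reporting every attempt:

```java
import java.io.IOException;

final class BoundedRetry {

    /** One attempt at an operation that may fail with an IOException. */
    interface Attempt<T> {
        T run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times and returns the first result.
     * If every attempt fails, throws one IOException whose cause is the last failure.
     */
    static <T> T retry(int maxAttempts, Attempt<T> attempt) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                last = e; // remember the failure, report it only once at the end
            }
        }
        throw new IOException("Gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        try {
            // Simulated lock attempt that always fails.
            retry(50, () -> {
                throw new IOException("Failed to obtain node lock");
            });
        } catch (IOException e) {
            System.err.println(e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```

Keeping the last failure as the cause means the underlying problem, such as an inaccessible path, is still visible in the one error that is reported.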


(Pascal P. Pochet) #3

50?
FYI, I stopped it when the counter was already above 300,
so there may be more than one aspect to check here.

(And sorry, I forgot to specify: ES version 0.16.1.)


(Shay Banon) #4

Which counter?


(Pascal P. Pochet) #5

The one in the log that prefixes the "Error injecting constructor" line:

  1. Error injecting constructor, java.io.IOException: Failed to obtain node lock

