Hi, we've been using ES for a while now, specifically version 0.90.3. A couple of months ago we decided to migrate to the latest version, which ended up being 1.4.1. No data migration was necessary because we keep a redundant copy of the data in MongoDB, so yesterday we simply enabled data writing to the new ES cluster. Everything was running smoothly until we noticed that, on the hour, there were bursts of four or five log messages of the following kinds:
Error indexing None into index ind-analytics-2015.01.08. Total elapsed time: 1065 ms.
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (acquire index lock) within 1s
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.run(MetaDataCreateIndexService.java:148) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17]
    at java.lang.Thread.run(Thread.java:722) ~[na:1.7.0_17]
[ForkJoinPool-2-worker-15] c.d.i.p.ActorScatterGatherStrategy - Scattering to failed in 1043ms
org.elasticsearch.action.UnavailableShardsException: [ind-2015.01.08.00][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1s], request: index {[ind-2015.01.08.00][search][...]}
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.retryBecauseUnavailable(TransportShardReplicationOperationAction.java:784) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.raiseFailureIfHaveNotEnoughActiveShardCopies(TransportShardReplicationOperationAction.java:776) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:507) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419) ~[org.elasticsearch.elasticsearch-1.4.1.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_17]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_17]
    at java.lang.Thread.run(Thread.java:722) ~[na:1.7.0_17]
This occurs on the hour because we write to hour-based indices. For example, all writes from 18:00:00 to 18:59:59 on 01/08 go to ind-2015.01.08.18; at 19:00:00 all writes start going to ind-2015.01.08.19, and so on.
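For reference, the index name is simply derived from the event's timestamp, roughly like this (a simplified sketch, not our exact code):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    // Sketch only: build the hour-based index name for an event timestamp,
    // e.g. any Date between 18:00:00 and 18:59:59 on 2015-01-08 -> "ind-2015.01.08.18".
    // Uses the JVM default time zone; the real naming code (and its time zone) isn't shown here.
    static String hourlyIndexName(Date eventTimestamp) {
        return "ind-" + new SimpleDateFormat("yyyy.MM.dd.HH").format(eventTimestamp);
    }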
With ES 0.90.3, automatic index creation worked flawlessly (with no complaints), but the new version doesn't seem to handle that feature very well. It looks like, when all those concurrent writes compete to be the first to create the index, all but one fail. Of course we could just create such indices manually and avoid the situation altogether, but that would only be a workaround for a feature that previously worked.
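If we did go the manual route, with the 1.4 Java client it would look something like this (a sketch of the idea only, not code we actually run):

    import org.elasticsearch.client.Client;
    import org.elasticsearch.indices.IndexAlreadyExistsException;

    // Sketch: pre-create the next hour's index from a single scheduled job, so the
    // concurrent writers never have to race on index creation. "client" is an
    // already-built 1.4.x Client; the name would come from the same hourly scheme.
    static void preCreateHourlyIndex(Client client, String indexName) {
        try {
            client.admin().indices().prepareCreate(indexName).execute().actionGet();
        } catch (IndexAlreadyExistsException e) {
            // Another process created it first - which is fine.
        }
    }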
Also, we use ES through the native Java client and the configuration for
all our indices is
Also, we have another cluster (for different purposes) that has 3 nodes, but we haven't experienced such errors with it (on that cluster we create indices on a daily basis).
On Thursday, January 8, 2015, at 16:23:12 (UTC-3), Tom wrote:
4
We enlarged our cluster to 5 nodes and now the QUORUM error messages seem to have disappeared.
The "failed to process cluster event (acquire index lock) within 1s" kind of messages are still happening, though.
Tom;
Please always use an odd number of data nodes, in particular with replicas > 0, so as not to confuse the ES quorum formula, and also to avoid split brain by setting minimum_master_nodes.
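For example, with 5 master-eligible nodes a majority is (5 / 2) + 1 = 3. Illustrative only: this is the node-side setting you would normally put in each elasticsearch.yml, shown here with the 1.x Java settings builder:

    import org.elasticsearch.common.settings.ImmutableSettings;
    import org.elasticsearch.common.settings.Settings;

    // Equivalent of "discovery.zen.minimum_master_nodes: 3" in elasticsearch.yml.
    // 3 is a majority of 5 master-eligible nodes and protects against split brain.
    Settings nodeSettings = ImmutableSettings.settingsBuilder()
            .put("discovery.zen.minimum_master_nodes", 3)
            .build();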
It seems there is more than one process trying to create the index, is that possible?
Jörg
Well, yes. We also have a cluster for the app, where each node talks to the Elasticsearch cluster independently.
Remember that we are not creating the index manually. Each app node issues an index operation on an index that may not yet exist, and we expect ES to take care of creating the index on demand. Many processes may issue the same indexing operation on the ES cluster "simultaneously", and only one of them can succeed in triggering the index creation.
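Each writer basically does the equivalent of the following and relies on auto-creation (simplified, not our actual code; the index name and type are taken from the log lines above, and the document source is just a placeholder):

    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.client.Client;

    // Sketch of what every app node does concurrently: index into an hour-based index
    // that may not exist yet and let ES create it on demand. "client" is the shared
    // 1.4.x Client instance of that app node.
    IndexResponse response = client.prepareIndex("ind-2015.01.08.00", "search")
            .setSource("{\"field\":\"value\"}")
            .execute()
            .actionGet();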
Tom;
Sorry, I didn't mean to say "the same indexing operation" but rather multiple indexing operations (with distinct data) on the same non-existent index.
I think you can safely ignore "failed to process cluster event (acquire index lock) within 1s" in that case. These messages come from index creation requests that are submitted concurrently - only one request can succeed; the others will get stuck.
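One thing you could try (just a suggestion, and it assumes the [1s] in your logs comes from a timeout your client sets on the index requests) is to give those requests a more generous timeout, so writers that lose the creation race wait for the index to appear instead of failing after one second. A rough sketch:

    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.unit.TimeValue;

    // Assumption: the 1s timeout in the logs is set on the index request itself.
    // A longer timeout lets a request that arrives while the index is still being
    // created wait for it rather than fail immediately.
    IndexResponse response = client.prepareIndex("ind-2015.01.08.00", "search")
            .setSource("{\"field\":\"value\"}")
            .setTimeout(TimeValue.timeValueSeconds(30))
            .execute()
            .actionGet();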
So you're saying that the indexing operations associated with the unsuccessful create-index requests will still succeed (i.e., all the data will be stored)?