Performance tuning for alias creation

Hi,

we've recently started using Elasticsearch within our application and are
experiencing slow response times when creating aliases. We've adopted the
'users data flow' model described in this presentation -
http://www.elasticsearch.org/videos/big-data-search-and-analytics/ - and
are generating filtered aliases for each user in the system.
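
For readers unfamiliar with the model, the sketch below builds the JSON body for a request to the Elasticsearch aliases endpoint (POST /_aliases) that adds one filtered alias per user. The alias naming scheme, the user_id field, the routing value and the index name env1 are illustrative assumptions and are not taken from this thread. The endpoint accepts several actions in one body, so a batch of new users can be sent as a single request; whether that helps with the delays described here is not established in the thread.

```python
import json


def filtered_alias_action(index, user_id, field="user_id"):
    """Build one 'add' action for a per-user filtered alias.

    The alias name, filter field and routing value are assumptions made
    for illustration only.
    """
    return {
        "add": {
            "index": index,
            "alias": "user_{}".format(user_id),
            "filter": {"term": {field: user_id}},
            "routing": str(user_id),
        }
    }


def alias_request_body(index, user_ids):
    """Combine the actions for several users into one _aliases request body."""
    return {"actions": [filtered_alias_action(index, uid) for uid in user_ids]}


# "env1" and the user ids are hypothetical.
body = alias_request_body("env1", [101, 102, 103])
print(json.dumps(body, indent=2))
```

The printed document is what would be sent as the request body; sending it is left out so that the sketch runs without a server.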

As part of our automated test process we are creating a number of new users
in a short space of time. This, in turn, triggers the generation of new
aliases within Elasticsearch and this is where we are experiencing
unexpected delays. For example, during a recent test run we created 44 new
aliases within a 70-second period. The time taken for these operations to
complete started at around 2 seconds but had risen to 69 seconds by the
end. This represents a particularly extreme case, but response times of
over 20 seconds are quite common. Generally the pattern is the same:
response times are good to start with and then gradually degrade as
more requests are sent.

In terms of our set-up, we have a single Elasticsearch node shared between
multiple (~8) environments. Each environment has a separate index with 5
shards allocated to it. Our largest shards are storing around 2 million
documents, all of which are reasonably small.

My questions are:

  • Is this kind of behaviour expected when attempting to create multiple
    aliases in a relatively short space of time?
  • Are there any settings that we can experiment with to improve
    performance?
  • Is this likely to be a side effect of our current architecture (i.e.
    sharing a single node between several environments)?
  • What are the most useful stats that we can monitor on the server to
    help diagnose the cause of the delays?
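
On the last question, one statistic that may be worth watching, on Elasticsearch versions that provide it, is the cluster's pending-task queue (GET /_cluster/pending_tasks), which lists cluster state updates waiting to be processed. The sketch below summarises a response of that shape. The sample response is made up for illustration and is not output from the server discussed here, and matching alias updates by looking for "alias" in the source string is an assumption.

```python
def summarize_pending_tasks(response):
    """Summarise a pending-tasks response: queue length, longest wait, alias tasks."""
    tasks = response.get("tasks", [])
    waits = [task.get("time_in_queue_millis", 0) for task in tasks]
    alias_tasks = [task for task in tasks if "alias" in task.get("source", "")]
    return {
        "count": len(tasks),
        "max_wait_ms": max(waits) if waits else 0,
        "alias_count": len(alias_tasks),
    }


# Fabricated sample data, for illustration only.
sample_response = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "index-aliases", "time_in_queue_millis": 4200},
        {"insert_order": 102, "priority": "URGENT",
         "source": "index-aliases", "time_in_queue_millis": 2600},
        {"insert_order": 103, "priority": "HIGH",
         "source": "refresh-mapping", "time_in_queue_millis": 900},
    ]
}

summary = summarize_pending_tasks(sample_response)
print(summary)
```

A queue length or maximum wait that keeps growing while aliases are being created would be consistent with updates arriving faster than they are applied.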

Any insight you can offer will be greatly appreciated.

Thanks,
Paul

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hey Paul,

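this looks really strange. There was a recent change to alias
processing in the master branch. Is it possible for you to test it?
For more info see

https://github.com/elasticsearch/elasticsearch/issues/2832

https://github.com/elasticsearch/elasticsearch/commit/b657bdfa1a9467848cc1844b5c732087e5eae1ca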

Thanks Alexander.

Should be possible to test the latest code on our server. I'll give it a
whirl and see if things improve.

Cheers,
Paul

On Tuesday, 2 April 2013 00:18:10 UTC-7, Alexander Reelsen wrote:

Hey Paul,

this looks really strange. There is a recent change about aliases
processing in the master branch. Is it possible for you to test this?
For more info see

https://github.com/elasticsearch/elasticsearch/issues/2832

https://github.com/elasticsearch/elasticsearch/commit/b657bdfa1a9467848cc1844b5c732087e5eae1ca

Hi,

I haven't had a chance to test the latest code yet as I got distracted by
other work. However, I wanted to post an update concerning another possible
bottleneck for alias creation.

The InternalClusterService uses an executor service with a single
thread to process all the alias updates it receives. Each update then
reaches the IndicesClusterStateService, where the new aliases are added to
the IndexAliasesService. The processing carried out by the
IndicesClusterStateService is done within a synchronized block. Could
this be contributing to the slow-down we're experiencing? Are the
InternalClusterService and IndicesClusterStateService shared across all
indices on a single node? Is it possible that a large number of alias
requests could cause a backlog of ClusterStateUpdateTasks in the
InternalClusterService?
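
To make the backlog question concrete, here is a minimal sketch of a first-in, first-out queue served by one worker. It uses the figures from the first message (44 requests spread evenly over 70 seconds) and assumes a constant 2-second processing time per update. That constant cost and the even spacing are assumptions made for illustration; this is a toy model and not Elasticsearch code.

```python
def simulate_single_worker(arrivals, service_time):
    """Return the latency of each request in a FIFO queue with one worker."""
    latencies = []
    free_at = 0.0  # time at which the worker finishes its current task
    for arrival in arrivals:
        start = max(arrival, free_at)      # wait if the worker is still busy
        free_at = start + service_time     # the worker is busy until then
        latencies.append(free_at - arrival)
    return latencies


request_count = 44   # from the test run described in the first message
window = 70.0        # seconds, from the same test run
arrivals = [i * window / (request_count - 1) for i in range(request_count)]

# Assumed constant cost per update, matching the first response time reported.
latencies = simulate_single_worker(arrivals, service_time=2.0)
print(latencies[0], latencies[-1])
```

Under these assumptions requests arrive roughly every 1.6 seconds but take 2 seconds each, so latency climbs steadily from 2 seconds to 18 seconds. A backlog on a single thread would therefore produce the same shape of degradation, although a constant per-update cost would not by itself reach the 69 seconds reported.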

Cheers,
Paul
