ES Rally - indexing time increasing

Hi there,

We set up a nightly race on an existing cluster to see the evolution of the cluster's performance.

Each night we delete the indices left over from the previous race, index them again (3 indices, about 10 GB in total), run queries and aggregations, and finish with a force merge. Everything works fine and all metrics make sense, except one: the indexing time.

Indeed, it's been gradually increasing day after day from the beginning:

What puzzles me is that the store size, indexing speed, search throughput, and latency have all stayed constant over the same period...

Any idea what could be happening?

Thanks in advance for any help!

Wow, that steady increase is intriguing.

Does the cluster contain data other than the Rally indices? Using Rally on a production cluster is known to skew results.

One option here is to record a flamegraph during indexing using async-profiler. It will tell us where indexing is spending its time. Given the 4x increase, we should hopefully see clearly what the issue is.

Thanks for your answer!

It is indeed!

It does, which is why I mentioned that all the other metrics are steady. I am aware that this setup is not optimal, and I would understand results drifting in several directions at once, but I have a hard time understanding why only this one metric changes.

I am not familiar with this at all, but I will try to get this done and report back here.

Thanks!

Hi @jsu

Indexing time as reported by Rally is a cumulative statistic that the cluster maintains. If you were to invoke a recent version of Rally manually against your cluster, you'd see a warning in the console about ensuring your cluster is in a known good state for benchmark results. If your nodes are not restarted between runs, we would expect this statistic to grow over time.

Sorry, I was mistaken: this statistic comes from the index stats API, not the node stats API. It is cumulative and includes all the indices on the cluster, so as long as your benchmark indices are co-located with other indices, this stat will increase.
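
To illustrate the point about the cluster-wide figure, here is a minimal Python sketch that compares two snapshots of index stats and separates the delta of the benchmark indices from the delta across all indices. It assumes the snapshots have the shape of a `GET /_stats/indexing` response (`_all.primaries.indexing.index_time_in_millis` and the same path under `indices.<name>`). The index names, the sample numbers, and the helper functions are made up for illustration; they are not part of Rally or Elasticsearch.

```python
def index_time_ms(stats, index=None):
    """Read the primaries' indexing time (ms) from an index-stats style dict.

    With index=None the cluster-wide "_all" figure is returned, otherwise the
    figure for the named index.
    """
    node = stats["_all"] if index is None else stats["indices"][index]
    return node["primaries"]["indexing"]["index_time_in_millis"]


def indexing_time_delta(before, after, indices):
    """Per-index indexing time spent between two snapshots.

    An index that is missing from `before` (e.g. it was created by the race)
    is treated as starting from 0.
    """
    deltas = {}
    for name in indices:
        start = index_time_ms(before, name) if name in before.get("indices", {}) else 0
        deltas[name] = index_time_ms(after, name) - start
    return deltas


def _entry(ms):
    # Helper that builds the nested structure for the hypothetical sample data.
    return {"primaries": {"indexing": {"index_time_in_millis": ms}}}


# Hypothetical snapshots taken before and after a race.
before = {"_all": _entry(1000), "indices": {"logs-prod": _entry(1000)}}
after = {
    "_all": _entry(1900),
    "indices": {"logs-prod": _entry(1400), "rally-bench": _entry(500)},
}

benchmark_delta = indexing_time_delta(before, after, ["rally-bench"])
cluster_delta = index_time_ms(after) - index_time_ms(before)

print(benchmark_delta)  # time attributable to the benchmark index only
print(cluster_delta)    # also includes indexing into the other indices
```

In this made-up example the benchmark index accounts for only part of the cluster-wide increase; the rest comes from the co-located index that keeps receiving documents.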


Hi @RickBoyd

Thanks!

To make sure I understand correctly, which field in the index stats API are we talking about? Is it index_time_in_millis?

If so, I looked through the documentation but couldn't find anything specific about it. Would you be able to point me in the right direction?

Do you mean that it would be the total indexing time (at the thread level) across all indices during the measurement?
