I want to evaluate the impact of running Elasticsearch as a Docker container instead of a classic/native installation. So I want to run Rally against both the Docker container and the classic installation we have and compare them, racing on the geonames, http_logs and eventdata tracks.
Also I want to run each race three times and average the values, because in the first two iterations I saw differences of about 3-4% in indexing time and the 50th percentiles within the same test set, and for large_filtered_terms the difference between two runs was about 12%.
I am a bit curious why I see such big differences. Nevertheless, I want to work around this by averaging the results and comparing the Docker average against the native average.
Before accidentally reinventing the wheel I just wanted to ask:
Is there any out-of-the-box way to send the results to a separate Elasticsearch instance so I can access the result data via Kibana?
Are there any pre-built dashboards?
Is there already a function to compare a baseline to more than one contender? Not racing one on one, but with a full race grid.
Is there already a function to calculate the average result of multiple racers?
What experience do you have about differences in the results when rerunning the test? Is 12% normal?
There are no pre-built dashboards. We have it on our roadmap but it’s not a high priority item.
We generally don’t provide dashboards because there are many different ways you might want to use the data. To create your own dashboard, please query rally-results-*. Take a look at what metrics we collect. Another tip is to use user tags (--user-tag) to simplify the filtering.
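For example, a minimal sketch of such a query with the Python elasticsearch client (only illustrative: the user-tag field name user-tags.setup is an assumption, and the exact search() signature differs between client versions, so check an actual document in Kibana first):

from elasticsearch import Elasticsearch

# The separate metrics/results cluster that Rally reports into.
es = Elasticsearch("http://localhost:9200")

# List the summary results of every race tagged e.g. with --user-tag="setup:docker".
# "user-tags.setup" is an assumed field name; inspect a document to confirm it.
resp = es.search(
    index="rally-results-*",
    body={
        "size": 100,
        "query": {"term": {"user-tags.setup": "docker"}},
    },
)

for hit in resp["hits"]["hits"]:
    doc = hit["_source"]
    print(doc.get("name"), doc.get("task"), doc.get("value"))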
Is there already a function to compare a baseline to more than one contender? Not racing one on one, but with a full race grid.
The tournament functionality is only meant for simple cases. For more complex cases it is best to analyze the data yourself in Kibana.
Is there already a function to calculate the average result of multiple racers?
No, by design we don’t build a whole bunch of analysis capability into Rally; Kibana (or other tools) can be used for more complex analysis.
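As a starting point for such averaging, an aggregation over the raw samples can compute per-setup averages across all races. The sketch below is only illustrative: it again assumes the Python elasticsearch client and a user-tags.setup field coming from --user-tag.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Average all "normal" service_time samples per setup tag across every race,
# e.g. to compare the docker average against the native average.
resp = es.search(
    index="rally-metrics-*",
    body={
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": "service_time"}},
                    {"term": {"sample-type": "normal"}},
                ]
            }
        },
        "aggs": {
            "per_setup": {
                "terms": {"field": "user-tags.setup"},  # assumed user-tag field
                "aggs": {"avg_service_time": {"avg": {"field": "value"}}},
            }
        },
    },
)

for bucket in resp["aggregations"]["per_setup"]["buckets"]:
    print(bucket["key"], bucket["avg_service_time"]["value"])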
What experience do you have about differences in the results when rerunning the test? Is 12% normal?
Thanks a lot for your reply. I configured the tested cluster to push its metrics to a separate monitoring cluster. I also sent Metricbeat data to that monitoring cluster to have a complete overview of the system during the test runs.
Is the source code / configuration for your nightly dashboards available somewhere to download?
I have some issues understanding the values/fields which are stored in the rally-metrics-* index. What is the meaning of the following fields?
value (is it a cumulative value which rises with each probe?)
meta.took
meta.hits
I did not find a description of them in the esrally documentation. So if I want to visualize the latency or service_time as a histogram over time for painless_dynamic, which fields do I need to use?
I would filter the index rally-metrics-* for "name: latency AND sample-type: normal", but which field contains my data?
When I take the field value as the data source, the histogram looks like this: [screenshot]
So it looks like a cumulated value, but if I just check the same for index-append, it looks like a current value: [screenshot]
Is the source code / configuration for your nightly dashboards available somewhere to download?
At this point it is not available. We do have a way to generate charts in Rally: https://github.com/elastic/rally/blob/master/esrally/chart_generator.py. Please note that at this point this is considered experimental and is thus intentionally undocumented. There is a mode in the chart generator that lets you generate charts for a single combination of a track, challenge, car and node count. You could try something like: esrally generate charts --track=geonames --challenge=append-no-conflicts --chart-type=time-series --node-count=1 --car=4gheap --output-path=output-my-charts.json
value (is it a cumulative value which rises with each probe?)
It is not cumulative.
meta.took
This is what Elasticsearch returns for how long the query took to complete.
meta.hits
This is also what Elasticsearch returns: the successful hits of the query.
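Putting the field answers together, a minimal sketch of the kind of query behind such a histogram could look like this (Python elasticsearch client assumed; the operation field name and @timestamp are assumptions to verify against an actual document, and the search() signature varies by client version):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Latency samples of the measurement phase for the painless_dynamic operation,
# ordered by time so they can be plotted as a histogram / time series.
# The per-sample measurement itself is in "value".
resp = es.search(
    index="rally-metrics-*",
    body={
        "size": 1000,
        "sort": [{"@timestamp": "asc"}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": "latency"}},
                    {"term": {"sample-type": "normal"}},
                    {"term": {"operation": "painless_dynamic"}},  # field name assumed
                ]
            }
        },
    },
)

for hit in resp["hits"]["hits"]:
    doc = hit["_source"]
    print(doc["@timestamp"], doc["value"], doc.get("unit"))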
So it looks like a cumulated value, but if I just check the same for index-append, it looks like a current value:
Rally reports service time as the time it took to process the request, from the moment it was sent to Elasticsearch until the reply came back. Latency is service time plus the extra waiting time of the request. For more information please take a look at the FAQ. From your graph it looks like there might be some contention in your setup.
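To see why this kind of contention makes latency climb while service time stays flat, here is a minimal, purely illustrative sketch (the numbers are made up and not taken from any benchmark):

# A target throughput of 1.5 ops/s means one request is scheduled every ~0.67 s,
# but the system below needs 1.0 s per request. The schedule is fixed, so waiting
# time accumulates: service time stays flat while latency (wait + service) grows.

interval = 1.0 / 1.5   # seconds between scheduled requests (target throughput 1.5 ops/s)
service_time = 1.0     # seconds the system actually needs per request

next_free = 0.0        # time at which the (single) backend becomes free again
for i in range(10):
    scheduled = i * interval           # when the load generator intends to send the request
    start = max(scheduled, next_free)  # it is only processed once the backend is free
    finish = start + service_time
    latency = finish - scheduled       # waiting time + service time
    next_free = finish
    print(f"request {i}: service_time={service_time:.2f}s latency={latency:.2f}s")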
So do I understand it correctly?
esrally is querying data at a fixed rate. The Elasticsearch backend responds more slowly than esrally is querying, so the queue of requests waiting to be processed grows over time.
So the target throughput is 1.5 and our system is only capable of about 0.75 to 1.0.
That would explain the rising latency.
Did I understand that correctly?
What if my system were capable of a throughput of 5.0? Would Rally then report max throughput = 1.5 because Rally was not requesting more (i.e. just showing the real measurement), or would I see something around 5.0 because Rally would calculate it based on latency and busy and idle time?
Is it documented anywhere which resources mainly affect the throughput? Load and CPU are not at their limits when the latency is increasing, and at first glance I also see no issues on disk and network.
If you set the target throughput to 1.5 operations per second then Rally will try to achieve that but will not go beyond it. For a realistic benchmark you should choose a target throughput that you also see in your production system. Say you have an e-commerce website with 100 concurrent users and each of them hits your search page roughly once every 10 seconds; then you should see ~10 queries being issued per second. If you want to benchmark that scenario you should set a target throughput of 10 (operations per second).
No, and I don't think it would be helpful. Every system is different. If you have a slow disk, the disk might be the bottleneck; if you have a fast disk but only one CPU core, then it might be the CPU. Therefore, you should measure this. The USE method is a good start.