Premature end of Benchmark run


(Sanket) #1

The Rally benchmark I run against a remote cluster ends with the following warning:

[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup.

I initially used warmup time periods of 120s, then changed them to the values shown below. I still get the same warning.

Any pointers?


(Sanket) #2

Here is my track.json:

{
  "short-description": "Benchmarking the ES cluster",
  "description": "The benchmarking tasks to be performed against the cluster",
  "indices": [
    {
      "name": "es-benchmarking",
      "types": [
        {
          "name": "type",
          "mapping": "conversationHistory_mapping.json",
          "documents": "ch.json.bz2",
          "document-count": 10,
          "compressed-bytes": 1,
          "uncompressed-bytes": 2
        }
      ]
    }
  ],
  "operations": [
    {
      "name": "index-append",
      "operation-type": "index",
      "bulk-size": 5000
    },
    {
      "name": "index-update",
      "operation-type": "index",
      "bulk-size": 5000,
      "conflicts": "random"
    },
    {
      "name": "force-merge",
      "operation-type": "force-merge"
    },
    {
      "name": "index-stats",
      "operation-type": "index-stats"
    },
    {
      "name": "node-stats",
      "operation-type": "node-stats"
    },
    {
      "name": "default",
      "operation-type": "search",
      "body": {
        "query": {
          "match_all": {}
        }
      }
    },
    {
      "name": "scroll",
      "operation-type": "search",
      "pages": 25,
      "results-per-page": 1000,
      "body": {
        "query": {
          "match_all": {}
        }
      }
    }
  ],
  "challenges": [
    {
      "name": "append-no-conflicts",
      "description": "Indexes the whole document corpus using Elasticsearch default settings. We only adjust the number of replicas as we benchmark a single node cluster and Rally will only start the benchmark if the cluster turns green. Document ids are unique so all index operations are append only. After that a couple of queries are run.",
      "default": true,
      "index-settings": {
        "index.number_of_replicas": 0
      },
      "schedule": [
        {
          "operation": "index-append",
          "warmup-time-period": 20,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        },
        {
          "operation": "index-stats",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "node-stats",
          "clients": 1,
          "warmup-iterations": 100,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "default",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 50
        },
        {
          "operation": "scroll",
          "clients": 1,
          "warmup-iterations": 200,
          "iterations": 500,
          "target-throughput": 25
        }
      ]
    },
    {
      "name": "append-no-conflicts-index-only",
      "description": "Indexes ",
      "index-settings": {
        "index.number_of_replicas": 0
      },
      "schedule": [
        {
          "operation": "index-append",
          "warmup-time-period": 20,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        }
      ]
    },
    {
      "#COMMENT": "Temporary workaround for more realistic benchmarks with two nodes",
      "name": "append-no-conflicts-index-only-1-replica",
      "description": "Indexes .",
      "index-settings": {
        "index.number_of_replicas": 1
      },
      "schedule": [
        {
          "operation": "index-append",
          "warmup-time-period": 20,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        }
      ]
    },
    {
      "name": "append-fast-with-conflicts",
      "description": "Indexes the whole ",
      "index-settings": {
        "index.number_of_replicas": 0,
        "index.refresh_interval": "30s",
        "index.number_of_shards": 6,
        "index.translog.flush_threshold_size": "4g"
      },
      "schedule": [
        {
          "operation": "index-update",
          "warmup-time-period": 5,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        }
      ]
    },
    {
      "name": "append-no-conflicts-no-large-terms",
      "description": "Indexes the whole document ",
      "user-info": "This challenge .",
      "index-settings": {
        "index.number_of_replicas": 0
      },
      "schedule": [
        {
          "operation": "index-append",
          "warmup-time-period": 20,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        },
        {
          "operation": "index-stats",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "node-stats",
          "clients": 1,
          "warmup-iterations": 100,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "default",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 50
        },
        {
          "operation": "scroll",
          "clients": 1,
          "warmup-iterations": 200,
          "iterations": 500,
          "target-throughput": 25
        }
      ]
    },
    {
      "name": "search-only",
      "description": "Same as default challenge except it only runs the search operations",
      "user-info": "This .",
      "index-settings": {
        "index.number_of_replicas": 0
      },
      "schedule": [
        {
          "operation": "index-append",
          "warmup-time-period": 20,
          "clients": 8
        },
        {
          "operation": "force-merge",
          "clients": 1
        },
        {
          "operation": "index-stats",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "node-stats",
          "clients": 1,
          "warmup-iterations": 100,
          "iterations": 1000,
          "target-throughput": 100
        },
        {
          "operation": "default",
          "clients": 1,
          "warmup-iterations": 500,
          "iterations": 1000,
          "target-throughput": 50
        },
        {
          "operation": "scroll",
          "clients": 1,
          "warmup-iterations": 200,
          "iterations": 500,
          "target-throughput": 25
        }
      ]
    }
  ]
}


(Christian Dahlqvist) #3

The document set you are indexing is tiny, and likely to just need a single request to complete, which is why I suspect it completes during warm-up.
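
To make the arithmetic concrete, here is a rough sketch using the values from the track above (document-count 10, bulk-size 5000). The function name is hypothetical and this is only back-of-the-envelope math, not Rally's own scheduling code.

```python
import math


def bulk_requests_needed(document_count: int, bulk_size: int) -> int:
    """Return how many bulk requests it takes to index the whole corpus."""
    return math.ceil(document_count / bulk_size)


# Values taken from the track.json posted above.
tiny_corpus = bulk_requests_needed(document_count=10, bulk_size=5000)
print(tiny_corpus)  # 1

# Hypothetical larger corpus, for comparison only.
larger_corpus = bulk_requests_needed(document_count=1_000_000, bulk_size=5000)
print(larger_corpus)  # 200
```

With a single bulk request there is nothing left to measure once a 20 second warmup window has started, which would be consistent with the warning.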


(Sanket) #4

That was a mistake; I edited the document and changed the values. The dataset I am working with is 10GB in size.


(Daniel Mitterdorfer) #5

Hi @sanketshinde,

How long did the benchmark take in total? Depending on the machines you're running on and the document structure, a 10GB document corpus can easily be too small.

For testing you can try to set warmup-time-period to 0. Then, Rally will not do any warmup and you should see samples.
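
If the track has many challenges, editing each schedule entry by hand is error-prone. The following is a minimal sketch of doing it programmatically, assuming the track layout shown earlier in this thread (challenges, each with a schedule list). The helper name `disable_warmup` and the inline sample are hypothetical; in practice you would load and rewrite your own track.json.

```python
import json


def disable_warmup(track: dict) -> int:
    """Set every warmup-time-period in all schedules to 0.

    Returns the number of schedule entries that were changed.
    Entries that use warmup-iterations are left untouched.
    """
    changed = 0
    for challenge in track.get("challenges", []):
        for task in challenge.get("schedule", []):
            if "warmup-time-period" in task:
                task["warmup-time-period"] = 0
                changed += 1
    return changed


# Small inline sample mirroring the structure of the track above.
track = json.loads("""
{
  "challenges": [
    {
      "name": "append-no-conflicts",
      "schedule": [
        {"operation": "index-append", "warmup-time-period": 20, "clients": 8},
        {"operation": "force-merge", "clients": 1},
        {"operation": "index-stats", "clients": 1, "warmup-iterations": 500}
      ]
    },
    {
      "name": "append-fast-with-conflicts",
      "schedule": [
        {"operation": "index-update", "warmup-time-period": 5, "clients": 8}
      ]
    }
  ]
}
""")

changed = disable_warmup(track)
print(changed)  # 2
print(json.dumps(track, indent=2))
```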

Daniel


The error rate is 100%
(Sanket) #6

I set warmup-time-period to 0 in every occurrence in track.json. This time I got some numbers for the index-append metric, but the associated error rate was 100%. The data file I am using is 10GB in size and contains 1M documents. The cluster consists of 3 data nodes and 3 master nodes running on Standard D1 v2 Azure virtual machines.

Any pointers?


(Daniel Mitterdorfer) #7

Hi @sanketshinde,

Is there any chance you can share your track and the data (or a subset of it) privately with me?

Daniel


(Sanket) #8

Here is a link to a post I made previously. ESRally - data.json.bz2
You can find the script that I used for data generation here.

The track.json file looks like the one I posted earlier in this thread, except that every warmup-time-period is set to 0, as you previously suggested. The original data contains documents with 28 fields.

Does this help?


(Sanket) #9

Here is a snapshot of the most recent benchmarking run I made on an ES cluster deployed on Azure, with Standard D1 v2 machines and a heap size of 4GB.

[image: snapshot of the benchmarking run]


(Daniel Mitterdorfer) #10

Hi @sanketshinde,

I've just released Rally 0.7.3 which includes a new flag --on-error.

Could you please upgrade and add --on-error=abort? This should immediately print a detailed description of your problem on the command line. I hope that helps.
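
Independently of Rally, a 100% error rate on a bulk operation often means the individual items in each bulk response are being rejected. The sketch below tallies error types from an Elasticsearch bulk response body. It is not part of Rally; the helper name is hypothetical, and the sample response (including the error type and reason) is made up purely for illustration.

```python
from collections import Counter


def summarize_bulk_errors(response: dict) -> Counter:
    """Count failed items in a bulk response, grouped by error type."""
    counts = Counter()
    for item in response.get("items", []):
        # Each item has a single key naming the action (index, create, ...).
        for result in item.values():
            error = result.get("error")
            if not error:
                continue
            if isinstance(error, dict):
                counts[error.get("type", "unknown")] += 1
            else:
                # Older responses may carry the error as a plain string.
                counts[str(error)] += 1
    return counts


# Made-up sample response, for illustration only.
sample_response = {
    "took": 3,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {
            "index": {
                "_id": "2",
                "status": 400,
                "error": {
                    "type": "mapper_parsing_exception",
                    "reason": "illustrative reason",
                },
            }
        },
    ],
}

summary = summarize_bulk_errors(sample_response)
print(dict(summary))
```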

Daniel

