Esrally got "The benchmark ended already during warmup" when running custom track

I am running a custom track with esrally 1.1.0. My custom index mapping looks like this:

{
    "mappings": {
      "_default_": {
        "_all": {
          "enabled": false
        },
        "_field_names": {
          "enabled": false
        },
        "dynamic_templates": [
          {
            "message": {
              "match": "message",
              "mapping": {
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                },
                "type": "text"
              }
            }
          },
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "ignore_above": 256,
                "type": "keyword"
              }
            }
          }
        ]
      },
      "log": {
        "_all": {
          "enabled": false
        },
        "_field_names": {
          "enabled": false
        },
        "dynamic_templates": [
          {
            "message": {
              "match": "message",
              "mapping": {
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                },
                "type": "text"
              }
            }
          },
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "ignore_above": 256,
                "type": "keyword"
              }
            }
          }
        ],
        "properties": {
          "@timestamp": {
            "type": "date"
          },
          "app-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "appgroup-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "application-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "application-name": {
            "type": "keyword",
            "ignore_above": 256
          },
          "cluster-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "fields": {
            "properties": {
              "pipename": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "instance-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "local-ip": {
            "type": "keyword",
            "ignore_above": 256
          },
          "log-content": {
            "type": "keyword",
            "ignore_above": 256
          },
          "log-level": {
            "type": "keyword",
            "ignore_above": 256
          },
          "log-time": {
            "type": "keyword",
            "ignore_above": 256
          },
          "message": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "namespace-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "span-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "timestamp_es": {
            "type": "date"
          },
          "timestamp_ht": {
            "type": "date"
          },
          "trace-id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "type": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
}

When the run failed, I tried creating the index manually and removing the create-index action from the challenge schedule, but I still get

[WARNING] No throughput metrics available for [index-append]. Likely cause: The benchmark ended already during warmup.
[WARNING] No throughput metrics available for [random_script_score]. Likely cause: Error rate is 100.0%. Please check the logs.

when I run esrally --target-host=xxxx:9210 --pipeline=benchmark-only --track=logtest. I have made sure the JDK version is the same on all machines. The related log lines are:

2019-06-17 13:16:20,882 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:16:21,548 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:16:22,215 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:16:22,882 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:16:23,548 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:16:24,215 -not-actor-/PID:6704 elasticsearch WARNING GET http://172.17.0.101:9210/logtest/_search?request_cache=false [status:400 request:0.001s]
2019-06-17 13:01:55,121 -not-actor-/PID:6706 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.189s]
2019-06-17 13:01:55,173 -not-actor-/PID:6705 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.107s]
2019-06-17 13:01:55,198 -not-actor-/PID:6709 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.090s]
2019-06-17 13:01:55,225 -not-actor-/PID:6708 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.298s]
2019-06-17 13:01:55,253 -not-actor-/PID:6706 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.091s]
2019-06-17 13:01:55,277 -not-actor-/PID:6710 elasticsearch WARNING POST http://172.17.0.101:9210/_bulk [status:400 request:0.183s]

I have no idea what's wrong with my track or esrally configuration.

Hi,

the logs indicate that you got a lot of client-side errors (HTTP status 400). By default Rally will just continue and record the error rate. Please rerun with --on-error=abort. Rally will then abort on the first error and provide specific details about what went wrong.
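For example, reusing the command line from your post, the rerun would look like this:

esrally --target-host=xxxx:9210 --pipeline=benchmark-only --track=logtest --on-error=abort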

Daniel

Thanks Daniel. I re-ran the track and got:

esrally.exceptions.RallyAssertionError: ("Request returned an error. Error type: transport, 
Description: action_request_validation_exception ({'error': {'root_cause': [{'type': 
'action_request_validation_exception', 'reason': 'Validation Failed: 1: type is missing;2: type is         
missing;3: type is missing;4: type is missing;5: type is missing;6: type is missing;7: type is missing;8:     
type is missing;9: type is missing;10: type is missing;11: type is missing;12: type is missing;13: type 
is missing;14: type is missing;15: type is missing;16: type is missing;17: type is missing;18: type is 
missing;19: type is missing;20: type is missing;21: type is missing;22: type is missing;23: type is 
missing;24: type is missing;25: type is missing;26: type is missing;27: type is missing;28: type is 
missing;29: type is missing;30: type is missing;31: type is missing;32: type is missing;33: type is 
missing;34: type is missing;35: type is missing;36: type is missing;37: type is missing;38: type is 
missing;39: type is missing;40: type is missing;41: type is missing;42: type is missing;43: type is 
missing;44: type is missing;45: type is missing;46: type is missing;47: type is missing;48: type is 
missing;49: type is missing;50:

It seems something is wrong with my document data, but I exported it from a running ES cluster with the elasticdump tool. Here is one record of my data:

{"application-name":"demo","local-ip":"172.17.0.10","app-id":"1","instance-id":"172.17.0.10","namespace-id":"namespace-xxxx","span-id":"425aeb9b4872750e","message":"2019-06-11 15:32:28.982  INFO [demo,6d627be62ff455fd,425aeb9b4872750e,true] 3661 --- [http-nio-18081-exec-6] c.t.d.p.controller.DemoController    : demo -- config name: [echo-default-name]","type":"log","appgroup-id":"group-l9ynzydj","log-time":"2019-06-11 15:32:28.982","timestamp_ht":"2019-06-11T15:32:28.982+08:00","timestamp_es":"2019-06-11T15:32:29.809+08:00","@timestamp":"2019-06-11T07:32:29.063Z","log-level":"INFO","trace-id":"6d627be62ff455fd","log-content":"[http-nio-18081-exec-6] c.t.d.p.controller.ProviderController    : provider-demo -- provider config name: [echo-provider-default-name]","fields":{"pipename":"xxxx-l6ymbvgd"},"application-id":"application-l6ymbvgd","cluster-id":"cluster-mjy9mvgz"}

I have successfully added this record to the index I created via POST /myindex/log, so I guess the type=log is what's missing. I tried to modify the mappings section of the index.json file to:

"mappings": {
  "_all": {
    "enabled": false
  },
  "_field_names": {
    "enabled": false
  },
  "dynamic_templates": [
    {
      "message": {
        "match": "message",
        "mapping": {
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          },
          "type": "text"
        }
      }
    },
    {
      "strings": {
        "match_mapping_type": "string",
        "mapping": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    }
  ],
  "properties": {
    "@timestamp": {
      "type": "date"
    },
    "app-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "appgroup-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "application-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "application-name": {
      "type": "keyword",
      "ignore_above": 256
    },
    "cluster-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "fields": {
      "properties": {
        "pipename": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "instance-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "local-ip": {
      "type": "keyword",
      "ignore_above": 256
    },
    "log-content": {
      "type": "keyword",
      "ignore_above": 256
    },
    "log-level": {
      "type": "keyword",
      "ignore_above": 256
    },
    "log-time": {
      "type": "keyword",
      "ignore_above": 256
    },
    "message": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    },
    "namespace-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "span-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "timestamp_es": {
      "type": "date"
    },
    "timestamp_ht": {
      "type": "date"
    },
    "trace-id": {
      "type": "keyword",
      "ignore_above": 256
    },
    "type": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

However, esrally still gave the same error.

Hi,

it depends a bit on how you've defined the corpora in your Rally track. If you don't specify includes-action-and-meta-data, Rally will automatically generate the action-and-meta-data line for your bulk requests, but then you need to specify the involved indices in the indices section of the track. In your case you need to ensure that you've specified the type property on the index; Rally will then use that information to create the action-and-meta-data line in the bulk request for you. If this still does not work, can you please post a minimal example of the track?
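For example, the indices section could look roughly like this (just a sketch; I'm assuming here that your index is named logtest, that the mapping lives in index.json, and that the type is named log, matching the sample document above):

{
  "indices": [
    {
      "name": "logtest",
      "body": "index.json",
      "types": [ "log" ]
    }
  ]
}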

Daniel

Thanks Daniel, that works for me. But I still have a question about the test report for my custom track.
Here are my operations:

{
  "name": "index-append",
  "operation-type": "bulk",
  "bulk-size": {{bulk_size | default(5000)}},
  "ingest-percentage": {{ingest_percentage | default(100)}}
},
{
  "name": "index-update",
  "operation-type": "bulk",
  "bulk-size": {{bulk_size | default(5000)}},
  "ingest-percentage": {{ingest_percentage | default(100)}},
  "conflicts": "random",
  "on-conflict": "{{on_conflict | default('index')}}",
  "conflict-probability": {{conflict_probability | default(25)}},
  "recency": {{recency | default(0)}}
},
{
  "name": "default",
  "operation-type": "search",
  "body": {
    "query": {
      "match_all": {}
    }
  }
},
{
  "name": "scroll",
  "operation-type": "search",
  "pages": 25,
  "results-per-page": 1000,
  "body": {
    "query": {
      "match_all": {}
    }
  }
}

And here is my challenge:

{
  "name": "append-no-conflicts",
  "description": "Indexes the whole document corpus using Elasticsearch default settings. We only adjust the number of replicas as we benchmark a single node cluster and Rally will only start the benchmark if the cluster turns green. Document ids are unique so all index operations are append only. After that a couple of queries are run.",
  "default": true,
  "schedule": [
    {
      "operation": "delete-index"
    },
    {
      "operation": {
        "operation-type": "create-index",
        "settings": {{index_settings | default({}) | tojson}}
      }
    },
    {
      "name": "check-cluster-health",
      "operation": {
        "operation-type": "cluster-health",
        "index": "logtest",
        "request-params": {
          "wait_for_status": "{{cluster_health | default('yellow')}}",
          "wait_for_no_relocating_shards": "true"
        }
      }
    },
    {
      "operation": "index-append",
      "warmup-time-period": 120,
      "clients": 8
    },
    {
      "name": "refresh-after-index",
      "operation": "refresh",
      "clients": 1
    },
    {
      "operation": "force-merge",
      "clients": 1
    },
    {
      "name": "refresh-after-force-merge",
      "operation": "refresh",
      "clients": 1
    },
    {
      "operation": "index-stats",
      "clients": 1,
      "warmup-iterations": 500,
      "iterations": 1000,
      "target-throughput": 90
    },
    {
      "operation": "node-stats",
      "clients": 1,
      "warmup-iterations": 100,
      "iterations": 1000,
      "target-throughput": 90
    },
    {
      "operation": "default",
      "clients": 1,
      "warmup-iterations": 500,
      "iterations": 1000,
      "target-throughput": 50
    },
    {
      "operation": "scroll",
      "clients": 1,
      "warmup-iterations": 200,
      "iterations": 100,
      "#COMMENT": "Throughput is considered per request. So we issue one scroll request per second which will retrieve 25 pages",
      "target-throughput": 0.8
    }
  ]
}

I got a report that contains only one line for the index-append task: | All | error rate | index-append | 0 | % |. And race.json shows:

{
    "task": "index-append",
    "operation": "index-append",
    "throughput": {
     "min": null,
     "median": null,
     "max": null,
     "unit": "docs/s"
    },
    "latency": {},
    "service_time": {},
    "error_rate": 0.0
},

However, I want the detailed metrics I get when running the default geonames track:

|   All |                                                 Min Throughput |           index-append |     7018.45 |  docs/s |
|   All |                                              Median Throughput |           index-append |     7126.09 |  docs/s |
|   All |                                                 Max Throughput |           index-append |     7209.44 |  docs/s |
|   All |                                        50th percentile latency |           index-append |     5526.45 |      ms |
|   All |                                        90th percentile latency |           index-append |     5756.71 |      ms |
|   All |                                        99th percentile latency |           index-append |     5983.96 |      ms |
|   All |                                      99.9th percentile latency |           index-append |     6121.97 |      ms |
|   All |                                       100th percentile latency |           index-append |     6209.29 |      ms |
|   All |                                   50th percentile service time |           index-append |     5526.45 |      ms |
|   All |                                   90th percentile service time |           index-append |     5756.71 |      ms |
|   All |                                   99th percentile service time |           index-append |     5983.96 |      ms |
|   All |                                 99.9th percentile service time |           index-append |     6121.97 |      ms |
|   All |                                  100th percentile service time |           index-append |     6209.29 |      ms |

Hi,

this sounds as if your benchmark finishes within the warmup period, i.e. your benchmark contains too little data. Rally only reports results recorded after the warmup period, so there is nothing to show.

Daniel

Hi,
my track contains 1,125,199 docs with an uncompressed JSON size of 1,069,880,605 bytes. That's similar to the default geonames track.
JackGzj

Hi,

The geonames corpus isn't that large, and since you aren't throttling the index-append operation via the target-throughput property, it will run as fast as the load driver, the network, and the target Elasticsearch cluster allow.

Observe that with your defaults you are specifying a warmup-time-period of 120s, so if the benchmark (in particular the index-append operation) finishes faster than that, it explains what you are seeing.

You can try reducing the warmup-time-period to something small to test things out, and then consider alternatives, e.g. using a track with a larger corpus.
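For example, the index-append task in your schedule could be changed along these lines (10 seconds is just an illustrative value to verify that results show up; pick whatever suits your corpus):

{
  "operation": "index-append",
  "warmup-time-period": 10,
  "clients": 8
}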

Cheers,
Dimitris

Thanks Dimitris! That works exactly as expected and I have successfully run my tracks. Thanks again!
