Increasing track data with a for/range statement raises java.lang.ClassCastException

Kevins_Si · March 26, 2019, 8:35am

To enlarge the corpus by reusing the existing data, I added a for/range statement to the track, but I get an exception when running the Rally test.

The ClassCastException is as follows:

warn 2019-03-26T14:32:20.001Z  path: /geonames, params: {index=geonames}
java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
	at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:394) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:375) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.rest.action.admin.indices.RestCreateIndexAction.prepareRequest(RestCreateIndexAction.java:53) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:80) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.xpack.security.rest.SecurityRestFilter.lambda$handleRequest$0(SecurityRestFilter.java:61) ~[?:?]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.3.2.jar:6.3.2]

The modified track.json is as follows:

{% import "rally.helpers" as rally with context %}
{% set index_count = 10 %}
{
  "version": 2,
  "description": "POIs from Geonames",
  "data-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames",
  "indices": [
    {
      "name": "geonames",
      "body": "index.json"
    }
  ],
  "corpora": [
    {
      "name": "geonames",
      "base-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames",
      "documents": [
        {% set comma = joiner() %}
        {% for item in range(index_count) %}
        {{comma()}}
        {
          "source-file": "documents-2.json-{{item}}.bz2",
          "document-count": 11396505,
          "compressed-bytes": 264698741,
          "uncompressed-bytes": 3547614383
        }
        {% endfor %}
      ]
    }
  ],
  "operations": [
    {{ rally.collect(parts="operations/*.json") }}
  ],
  "challenges": [
    {{ rally.collect(parts="challenges/*.json") }}
  ]
}

dliappis · March 26, 2019, 1:08pm

Hello,

The jinja2 loop you've defined will try to download source files like documents-2.json-0.bz2, documents-2.json-1.bz2 etc.
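
To make the expansion concrete, here is a minimal Python sketch that mimics what the loop renders for index_count = 10. It does not run Jinja2 or Rally; the helper names rendered_source_files and download_urls are hypothetical, and joining the file name onto base-url with a plain "/" is an assumption about how the download URL is formed.

```python
# Mimics the Jinja2 loop in track.json: one "source-file" per value of range(index_count).
BASE_URL = "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames"


def rendered_source_files(index_count):
    """Return the file names the loop would emit (hypothetical helper)."""
    return [f"documents-2.json-{item}.bz2" for item in range(index_count)]


def download_urls(index_count, base_url=BASE_URL):
    """Join each rendered file name onto base-url (assumed: plain "/" join)."""
    return [f"{base_url}/{name}" for name in rendered_source_files(index_count)]


if __name__ == "__main__":
    for url in download_urls(10):
        print(url)
```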

Since you are using the same base-url as the upstream geonames, you end up referencing files that don't exist, which you can easily check yourself with:

curl http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json-0.bz2
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>corpora/geonames/documents-2.json-0.bz2</Key><RequestId>037E3BB5A46782DE</RequestId><HostId>7oagYu1Ol6QaaxOINX+0VFpGzQ0o9enTiz/uDRIHsEDwTaNVQH3tqV+MWkuvi8/gvSHT4Bo6MXc=</HostId></Error>

I am surprised you didn't get a 404 while running this, as such files don't exist.

If you want to use a larger corpus by repeating geonames 10 times, you can simply download it locally, concatenate it into a larger file, and either provide the full path to that file in source-file or upload it to some location you control and change base-url accordingly. Details in https://esrally.readthedocs.io/en/latest/track.html#corpora.
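
A minimal sketch of the local concatenation, assuming the original documents-2.json.bz2 has already been downloaded and holds one JSON document per line. The function name repeat_corpus and the file names in the usage note are hypothetical; only Python's standard bz2 and os modules are used.

```python
import bz2
import os


def repeat_corpus(src_path, dst_path, times):
    """Write the decompressed contents of src_path `times` times into dst_path (bz2).

    Returns (document_count, uncompressed_bytes, compressed_bytes) for the new file.
    """
    docs = 0
    raw_bytes = 0
    with bz2.open(dst_path, "wb") as dst:
        for _ in range(times):
            # Stream line by line so the full corpus never sits in memory.
            with bz2.open(src_path, "rb") as src:
                for line in src:
                    # Guard against a missing final newline so that the last
                    # document of one copy does not merge with the next copy.
                    if not line.endswith(b"\n"):
                        line += b"\n"
                    dst.write(line)
                    docs += 1
                    raw_bytes += len(line)
    return docs, raw_bytes, os.path.getsize(dst_path)
```

Calling something like repeat_corpus("documents-2.json.bz2", "documents-2-x10.json.bz2", 10) returns the three numbers that would then go into document-count, uncompressed-bytes and compressed-bytes of the single "documents" entry in track.json.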

Rgs,
Dimitris

Kevins_Si · March 26, 2019, 1:36pm

Thanks for your feedback.
According to esrally's download policy, documents-2.json-xxx.bz2 would be downloaded from the AWS bucket.
However, I created the documents-2.json-xxx.bz2 files locally by copying the existing documents-2.json.bz2, and esrally decompressed them successfully while preparing the test.

dliappis · March 26, 2019, 3:15pm

I see. This is a very original but completely unsupported way of increasing the corpus size; for future compatibility you'd be better off following the approach I mentioned earlier.

Nevertheless I tried your geonames modification against 6.6.0 and 6.3.2 and didn't have any issues.

I used something like: esrally --distribution-version=6.3.2 --runtime-jdk=8 --track-path=~/.rally/geonames --challenge=append-no-conflicts-index-only.

Could you please paste your Rally command? On top of other things I am interested in which pipeline you are using.

Kevins_Si · March 26, 2019, 6:01pm

Thank you for your reply.
I run the command as usual, for example:

esrally race --pipeline=benchmark-only --target-hosts=localhost:9200 --track-path=geonames_large --client-options="xxxxxxxxx" --challenge=append-no-conflicts --track-params="bulk_size:10000,clients:200" --report-file=./markdown.md

I have not changed any pipeline configuration or other parameters.

dliappis · March 27, 2019, 8:18am

What's your version of Rally? esrally --version

It seems you are using the benchmark-only pipeline, so Rally is benchmarking against a cluster it hasn't set up itself. Is your Elasticsearch version 6.3.2? Is it the default distribution with security enabled?
As I said earlier I have been successful running a challenge from the modified geonames with >1 corpus against 6.3.2, so without additional information it's not clear what's going on.
Can you try running your track against Elasticsearch launched by Rally using something like (just make sure you have JAVA_HOME pointing to Java 8):

esrally --distribution-version=6.3.2 --runtime-jdk=8 --track-path=<your_custom_geonames_track> --challenge=append-no-conflicts-index-only

Additionally the cluster seems to be running on the same host (--target-hosts=localhost:9200), which is not a good practice for meaningful benchmarking results; the load driver should be kept separated from the ES nodes to avoid contention between each other.

Kevins_Si · March 27, 2019, 1:24pm

1. My Rally version is 1.0.4, the latest.
2. JAVA_HOME points to 1.8.121.
3. My target host was set up locally by myself.
4. The target host's Elasticsearch version is 6.3.2.

dliappis · March 27, 2019, 4:15pm

Can you try running your track against Elasticsearch launched by Rally using something like (just make sure you have JAVA_HOME pointing to Java 8):

esrally --distribution-version=6.3.2 --runtime-jdk=8 --track-path=<your_custom_geonames_track> --challenge=append-no-conflicts-index-only

and report back?

Kevins_Si · March 28, 2019, 1:24pm

Aha, I get it now. If I use the config copied from the master branch, add the for loop, and test against version 6.3.x, the error occurs.

I compared the same track from the master branch and from branch 6: with my for loop added, the track from branch 6 runs fine, but the one from master does not.

dliappis · March 28, 2019, 2:34pm

Right, when a track-path is not specified explicitly, Rally checks out the rally-tracks branch that corresponds to the detected Elasticsearch version. So you'll need to base your custom track on the right branch.
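
As a rough illustration of that matching, the sketch below lists branch names that could be tried for a given Elasticsearch version, from most to least specific. The order (full version, then major.minor, then major, then master) is an assumption for illustration and not Rally's actual implementation, and candidate_branches is a hypothetical helper.

```python
def candidate_branches(es_version):
    """List rally-tracks branch names to try, most specific first (assumed order)."""
    parts = es_version.split(".")
    # "6.3.2" -> ["6.3.2", "6.3", "6"]
    candidates = [".".join(parts[:n]) for n in range(len(parts), 0, -1)]
    # Assumed fallback when no versioned branch matches; Rally's real rules may differ.
    candidates.append("master")
    return candidates


if __name__ == "__main__":
    print(candidate_branches("6.3.2"))
```

In this thread the branch that worked for a 6.3.2 cluster was 6, so a custom track passed via --track-path should be copied from that branch rather than from master.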

system · April 25, 2019, 2:34pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
