When I increase the data by reusing the existing data and add a for-range statement to the track, I hit an exception when running the Rally test.
The ClassCastException is as follows:
warn 2019-03-26T14:32:20.001Z path: /geonames, params: {index=geonames}
java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:394) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:375) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.rest.action.admin.indices.RestCreateIndexAction.prepareRequest(RestCreateIndexAction.java:53) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:80) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.xpack.security.rest.SecurityRestFilter.lambda$handleRequest$0(SecurityRestFilter.java:61) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.3.2.jar:6.3.2]
The jinja2 loop you've defined will try to download source files like documents-2.json-0.bz2, documents-2.json-1.bz2 etc.
Since you are using the same base-url as the upstream geonames, you end up referencing files that don't exist, which you can easily check yourself with:
curl http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json-0.bz2
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>corpora/geonames/documents-2.json-0.bz2</Key><RequestId>037E3BB5A46782DE</RequestId><HostId>7oagYu1Ol6QaaxOINX+0VFpGzQ0o9enTiz/uDRIHsEDwTaNVQH3tqV+MWkuvi8/gvSHT4Bo6MXc=</HostId></Error>
I am surprised you didn't get a 404 while running this, as such files don't exist.
If you want to use a larger corpus by repeating geonames 10 times, you can simply download it locally, concatenate it into a larger file, and either provide the full path to that file in source-file or upload it to some location you control and change base-url accordingly. Details are at https://esrally.readthedocs.io/en/latest/track.html#corpora.
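A minimal sketch of the concatenation approach, assuming the corpus is a bz2-compressed file with one JSON document per line (as geonames is). The function name `repeat_corpus` and the file paths are hypothetical, not part of Rally. It writes the repeated corpus and returns the document count and uncompressed size, which should correspond to the document-count and uncompressed-bytes values the corpus entry in track.json would need:

```python
import bz2
import os
import tempfile


def repeat_corpus(source_path, target_path, repeats):
    """Write `repeats` copies of a bz2-compressed JSON-lines corpus.

    Returns (document_count, uncompressed_bytes) for the new corpus.
    Hypothetical helper for illustration; not a Rally API.
    """
    document_count = 0
    uncompressed_bytes = 0
    with bz2.open(target_path, "wb") as target:
        for _ in range(repeats):
            # Re-read the source for every copy so memory use stays flat.
            with bz2.open(source_path, "rb") as source:
                for line in source:
                    # Make sure each document ends with a newline so that
                    # copies do not run together at the seams.
                    if not line.endswith(b"\n"):
                        line += b"\n"
                    target.write(line)
                    document_count += 1
                    uncompressed_bytes += len(line)
    return document_count, uncompressed_bytes


if __name__ == "__main__":
    # Tiny self-contained demo with a throwaway two-document corpus.
    workdir = tempfile.mkdtemp()
    small = os.path.join(workdir, "documents-small.json.bz2")
    large = os.path.join(workdir, "documents-large.json.bz2")
    with bz2.open(small, "wb") as f:
        f.write(b'{"name": "a"}\n{"name": "b"}\n')
    docs, size = repeat_corpus(small, large, repeats=10)
    print("document-count:", docs, "uncompressed-bytes:", size)
```

For the real corpus you would point `source_path` at your local copy of documents-2.json.bz2 and then reference the resulting file via source-file, or upload it and adjust base-url, as described above.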
Thanks for your feedback.
According to esrally's download policy, documents-2.json-xxx.bz2 would be downloaded from the AWS bucket.
But I copied each documents-2.json-xxx.bz2 from the existing documents-2.json.bz2, and esrally decompressed them successfully while preparing the test.
I see. This is a very original but completely unsupported way of increasing the corpus size; for future compatibility you'd be better off following the approach I mentioned earlier.
Nevertheless I tried your geonames modification against 6.6.0 and 6.3.2 and didn't have any issues.
I used something like: esrally --distribution-version=6.3.2 --runtime-jdk=8 --track-path=~/.rally/geonames --challenge=append-no-conflicts-index-only.
Could you please paste your Rally command? On top of other things I am interested in which pipeline you are using.
It seems you are using the benchmark-only pipeline, so Rally is benchmarking against a cluster it hasn't set up itself. Is your Elasticsearch version 6.3.2? Is it the default distribution with security enabled?
As I said earlier I have been successful running a challenge from the modified geonames with >1 corpus against 6.3.2, so without additional information it's not clear what's going on.
Can you try running your track against Elasticsearch launched by Rally using something like (just make sure you have JAVA_HOME pointing to a Java 8):
Additionally the cluster seems to be running on the same host (--target-hosts=localhost:9200), which is not a good practice for meaningful benchmarking results; the load driver should be kept separated from the ES nodes to avoid contention between each other.
1. My Rally version is 1.0.4, the latest.
2. JAVA_HOME points to 1.8.121.
3. My target host was set up locally by myself.
4. The target host's Elasticsearch version is 6.3.2.
Right, Rally will check out (when not specifying a track-path explicitly) the right rally-tracks branch that corresponds to the detected Elasticsearch version. So you'll need to base your custom track on the right branch.