Currently I have a search with a match_all query that is failing; according to the logs, the operation is being run against all indices. The track has been configured to use a single index, i.e. not '_all'. My assumption was that all operations defined would run against the index named in the track.json.
Is there any reason/missing config as to why a search would not be targeting a single index? I was expecting the query URL to be /<index>/_search. Just for background my custom runner for msearch does work as expected, using the URL /<index>/_msearch
Unfortunately I am unable to copy and paste the actual logs and files, but I can write out any pertinent information that helps.
Rally logs: (How I was able to determine the unexpected URL being used)
Provided there is no bug in Rally, that's how it is supposed to work, i.e. your assumption is correct.
If there is only a single index, it will use it as the default index when running queries (and similarly for the document type). _all should only appear if there are multiple indices declared but you did not set one when defining the match-all query with the index property.
Would you be able to share the complete track.json (maybe with anonymised index / operation / parameter names), e.g. via a private message if you cannot post it publicly?
Your request to see the track.json file made me test a bare-bones version from the adding_tracks doc page. That works as expected. The track.json file I was using was based on Rally 0.7.4 and the nesting is now different for 0.8.0, but on the whole it worked in my dev environment. Perhaps enforcing a version or schema check could help anyone else in the future?
All good now though, just porting my config into a track.json file in the new format.
That is not good. There were a few changes in the track format between 0.7.4 and 0.8.0 but none of them should have (intentionally) caused this behaviour. If you could show the version before and afterwards that might be helpful in understanding why it has caused this behaviour.
There is something like that in place already, but so far I have frowned upon being too strict about it in the interest of backwards compatibility. I might need to revisit this decision in the future though. Also, we have a JSON schema that will check your Rally track, but sometimes something slips through. Thanks for your feedback.
Looks like the track.json was a bit of a red herring as to the problem. It was actually down to not specifying an index on the parameter source.
From the snippet below you can see there is an index attribute that I had not set. Never mind, got there in the end; I'm only surprised that I hadn't added it in the first place. Anyway, not setting it means you get the _all behaviour, which kind of makes sense I guess!
import random

def random_profession(indices, params):
    # you must provide all parameters that the runner expects
    return {
        "body": {
            "query": {
                "term": {
                    "body": "%s" % random.choice(params["professions"])
                }
            }
        },
        "index": None,
        "type": None,
        "use_request_cache": False
    }
Thanks for pointing that out, Andrew. Based on your feedback I have now changed the example for parameter sources in the docs to demonstrate how to set a default index parameter or make it overridable by the user. This should avoid this problem in the future.
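For anyone landing on this thread later, here is a minimal sketch of what such a parameter source could look like. It is not the exact example from the docs. It assumes that each entry in indices exposes a name attribute (the Index tuple below is a hypothetical stand-in so the snippet runs on its own) and that an "index" key in the operation parameters should override the default.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the index objects Rally passes to a parameter
# source; only a `name` attribute is assumed here.
Index = namedtuple("Index", ["name"])


def random_profession(indices, params):
    # With exactly one index declared in the track, use it as the default.
    # Otherwise fall back to "_all".
    default_index = indices[0].name if len(indices) == 1 else "_all"
    return {
        "body": {
            "query": {
                "term": {
                    "body": "%s" % random.choice(params["professions"])
                }
            }
        },
        # An explicit "index" in the operation parameters always wins.
        "index": params.get("index", default_index),
        "type": params.get("type"),
        "use_request_cache": False
    }


if __name__ == "__main__":
    request = random_profession([Index("people")], {"professions": ["nurse"]})
    print(request["index"])
```

With a single index the search then targets that index rather than _all, and a track author can still set "index" on the operation to point it elsewhere.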