I think a basic tool that is relatively independent of cluster specifics
could be pretty useful.
I'm imagining a tool that lets you run load tests against any cluster you point it at:
Test indexing by selecting the complexity of the data objects you're interested in - i.e., create N test indices with X shards and Y replicas each, and send either custom objects whose fields can be defined by length and type of random values, or basic objects of various sizes (an example tweet, a log record, a simple data point, etc.).
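Spinning up the test indices themselves is just a settings call; a rough sketch with requests (the host, index name, and counts are placeholders):

    import json
    import requests

    settings = {"settings": {"number_of_shards": 5, "number_of_replicas": 1}}
    requests.put("http://localhost:9200/test-index-0", data=json.dumps(settings))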
As a more concrete example of the custom-object idea: if I wanted to see how my system did with objects that looked like

    {
      "test1": 18,
      "test2": [
        {"test3": "XnksdjfknSeorjOJosdimkn skjnfds sdfsidfun ifsdfosdmfo"},
        {"test3": "dslkmlkdsfnUFIDNSiufndsfn DFISnu"}
      ]
    }

I could just pass something like a mapping:

    {
      "test1": {"type": "int", "minSize": -1029420, "maxSize": 10000000},
      "test2": {
        "type": "nested", "minSize": 0, "maxSize": 20,
        "properties": {"test3": {"type": "string", "minSize": 20, "maxSize": 160}}
      }
    }
Random objects would be generated according to the specifications of the template. Maybe you could also pick a type like "english", "french", "russian", etc., to generate strings built from a dictionary of real terms in that language (or be able to define custom "types" pointing at text files).
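To make that concrete, here's a rough Python sketch of what the generator side could look like (all names are made up for illustration, not an existing library); strings are just random ASCII here, but a dictionary-backed "english" type would slot in the same way:

    import random
    import string

    def random_value(spec):
        kind = spec["type"]
        if kind == "int":
            return random.randint(spec["minSize"], spec["maxSize"])
        if kind == "string":
            length = random.randint(spec["minSize"], spec["maxSize"])
            return "".join(random.choice(string.ascii_letters + " ") for _ in range(length))
        if kind == "nested":
            count = random.randint(spec["minSize"], spec["maxSize"])
            return [generate(spec["properties"]) for _ in range(count)]
        raise ValueError("unknown type: %s" % kind)

    def generate(template):
        # Build one random document matching the template.
        return {field: random_value(spec) for field, spec in template.items()}

    template = {
        "test1": {"type": "int", "minSize": -1029420, "maxSize": 10000000},
        "test2": {"type": "nested", "minSize": 0, "maxSize": 20,
                  "properties": {"test3": {"type": "string", "minSize": 20, "maxSize": 160}}},
    }
    print(generate(template))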
You could then indicate how many objects you want created in any given index (again as a min/max range), the number of workers writing to any given index, the total number of workers you want writing data, the total number of indices, etc.
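The worker side could probably just be multiprocessing; a minimal sketch of fanning the work out (the parameter names and the index_worker helper are purely illustrative):

    from multiprocessing import Pool

    def index_worker(args):
        index_name, doc_count = args
        # ...generate doc_count documents from the template and send them to
        # index_name, returning timing stats for later aggregation...
        return {"index": index_name, "docs": doc_count}

    num_indices = 4
    workers_per_index = 2
    jobs = [("test-index-%d" % i, 5000)
            for i in range(num_indices)
            for _ in range(workers_per_index)]

    if __name__ == "__main__":
        pool = Pool(processes=num_indices * workers_per_index)
        stats = pool.map(index_worker, jobs)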
Data could all be sent over the native API (I'm a Python guy, so I'll say Python) or over HTTP using something like the requests module. That could allow for some interesting comparisons between the APIs.
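Either path is only a couple of lines per document; roughly (host, index, and type names are placeholders):

    import json
    import requests
    from elasticsearch import Elasticsearch

    doc = generate(template)  # from the generator sketch above

    # Plain HTTP with requests:
    requests.post("http://localhost:9200/test-index-0/doc", data=json.dumps(doc))

    # Or through the official elasticsearch-py client:
    es = Elasticsearch(["localhost:9200"])
    es.index(index="test-index-0", doc_type="doc", body=doc)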
Test queries by doing pretty much the same thing: track the query response times per worker, plus other relevant stats. Maybe have a definable "max requests per second per worker" so you can replicate your worst-case user behavior in each process.
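A rate-limited query worker is simple enough to sketch (max_rps, duration, and the match_all query are just illustrative):

    import json
    import time
    import requests

    def query_worker(url, query, max_rps, duration):
        interval = 1.0 / max_rps
        timings = []
        deadline = time.time() + duration
        while time.time() < deadline:
            start = time.time()
            requests.post(url, data=json.dumps(query))
            timings.append(time.time() - start)
            # Sleep off whatever is left of this request's time slot.
            time.sleep(max(0.0, interval - (time.time() - start)))
        return timings

    timings = query_worker("http://localhost:9200/test-index-0/_search",
                           {"query": {"match_all": {}}}, max_rps=5, duration=30)
    print("%d queries, avg %.3fs" % (len(timings), sum(timings) / len(timings)))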
This would be step 1 of the process; step 2 would be developing something so that a central test system could allocate testing jobs and collect stats across a number of client test systems. So I'd set up a test service on host A and test clients on hosts B and C. B and C would be sent job properties by A; B and C would then launch, track their own stats, and send them back to A to aggregate. Scale out to as many systems as you like.
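The client side of that could be as dumb as pulling a job spec over HTTP and posting stats back; a sketch (the coordinator endpoints and run_load_test are hypothetical):

    import json
    import requests

    COORDINATOR = "http://host-a:8080"

    job = requests.get(COORDINATOR + "/job").json()    # template, worker counts, etc.
    stats = run_load_test(job)                         # the step-1 tool described above
    requests.post(COORDINATOR + "/stats", data=json.dumps(stats))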
This is just a first pass at the idea; there may be some dumb mistakes in logic or oversights about test cases, but I think an app like this could be pretty useful. Heck, you could have a GUI on it, or just make it run off a yaml file or something.
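If it ran off a yaml file, the config might look something like this (every key here is invented on the spot):

    indices: 4
    shards: 5
    replicas: 1
    docs_per_index: {min: 10000, max: 50000}
    workers_per_index: 2
    total_workers: 8
    template:
      test1: {type: int, minSize: -1029420, maxSize: 10000000}
      test2:
        type: nested
        minSize: 0
        maxSize: 20
        properties:
          test3: {type: string, minSize: 20, maxSize: 160}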
If it gets into my head enough, maybe I'll try to write this up, though like I said, it'd be in Python since that's my language of choice. So it wouldn't be as optimal a testing platform as a native Java app, I guess, but it'd still be useful as a proof of concept.
On Thursday, January 30, 2014 4:41:06 PM UTC-8, Josh Harrison wrote:
In our case, we're just interested in query stress testing. We've got a web app that queries our indexes, which are organized by week of the year, with a bunch of aliases making it easy to reach specific portions of the data. Questions about scaling the app have come up. In our case, that means testing through the app itself, which so far only makes queries. I figure we should load test our cluster directly too, so we can see whether a bottleneck is somewhere in the app or whether any eventual bottlenecks are on the cluster itself.
So far, as far as I can tell, I haven't been able to really max out the indexing rate on a system that is adequately equipped with resources. I've had 32 sub-process Python workers happily sending something like 5+ million records an hour to our cluster when backloading data, with no problem in indexing speed or other response times.
My current strategy is to take the ugliest, heaviest queries the application runs and simply use ABS or something similar to run them over HTTP with variables in a reasonable range. If I can make my cluster crash by doing that, I'll know that's my upper limit!
On Thursday, January 30, 2014 3:59:19 PM UTC-8, Jörg Prante wrote:
Just a few questions, because I'm also interested in load testing.
What kind of stress do you have in mind? Random data? Wikipedia? Logfiles? Just queries? What about indexing? And what client? Java? Other scripting languages? How should the cluster be configured: one node? Two or more nodes? Index shards? Replicas? Etc.
There are so many variants and options out there; I believe this is one of the reasons why a compelling load testing tool is still missing.
It would be nice to have a tool to upload ES performance profiles to a public web site, for checking how well an ES cluster is tuned in comparison to others. A unit of measure for comparing performance would need to be defined, e.g. "this cluster performs with a power factor of 1.0, this cluster has power factor 1.5, 2.0, ..."
That's only possible when all software and hardware characteristics are properly taken into account, plus "application profiles" for typical workloads, so it can be decided which configuration is best for which purpose.
Jörg