Test Harness for ElasticSearch


(pulkitsinghal) #1

Is there any generic test harness available for configuring and
running different types of queries with concurrent users against our
own datasets? I want to check with the community at large before I
start rolling my own :slight_smile:


(Ronak Patel) #2

I built my own to do this using a local node to handle basic integration
testing.
You can probably fire up your backend webapp (the one that talks to ES) and
use something like Apache JMeter to handle load testing against your webapp.



(Nick Dimiduk) #3

I just threw together a simple multi-threaded load script. It makes
assumptions about our deployment all over the place, but does the job. I
looked briefly at JMeter and will likely move to that tool when I find a
few minutes to study its use.

-n
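For readers wondering what such a script might look like, here is a minimal sketch and not the script described above. It assumes a node on localhost:9200, a hypothetical index called `products` with a `product_name` field, and a `match` query body; all of those names are placeholders, and older ES versions may need a different query type. The search function is passed in, so the threading part can be tried without a cluster.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_search(term, url="http://localhost:9200/products/_search"):
    """Send one search to a hypothetical local index and return the raw body.

    The URL, index name, field name and query type are all assumptions.
    """
    body = json.dumps({"query": {"match": {"product_name": term}}}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def timed_call(search_fn, term):
    """Run one search and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    search_fn(term)
    return time.perf_counter() - start


def run_load(search_fn, terms, num_threads=8):
    """Issue one search per term from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        latencies = list(pool.map(lambda term: timed_call(search_fn, term), terms))
    latencies.sort()
    return {
        "requests": len(latencies),
        "median_s": latencies[len(latencies) // 2] if latencies else None,
        "max_s": latencies[-1] if latencies else None,
    }
```

Calling `run_load(http_search, ["shoes", "hat"], num_threads=4)` would hit a real node; passing a stub function instead lets you check the harness itself first.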



(Otis Gospodnetić) #4

Hi,

Have you considered simply using JMeter?
We use it regularly when doing performance testing against
ElasticSearch or Solr.

Otis

Sematext is Hiring World-Wide -- http://sematext.com/about/jobs.html



(pulkitsinghal) #5

Yup, JMeter seems to be the go-to solution, and I've started on it. But if
there is any other advice, please keep the comments coming :slight_smile:



(Michael Sick) #6

I've used SoapUI a good deal for creating SOAP-based test clients, and the
same organization sponsors LoadUI, which reuses your base tests for load
testing and reporting. They claim decent support for REST. I can't vouch
for it, but it's probably worth a look.



(Jan Fiedler) #7

I have been using XLT (http://www.xceptance.com/products/xlt/what-is-xlt.html)
for all sorts of load testing. It's especially nice if your home (and
preferred ES API) is Java, as you write your load scripts in Java as JUnit
tests.


(pulkitsinghal) #8

I am aiming for a template that can be reused on specific datasets
and user workflows to find out the performance for each unique
ecosystem (data + types of queries + # of parallel queries for each
type + users + use-cases etc.), and then to come up with a formula
so that one is ready to scale ES.

To that end, everyone's suggestions have been really useful. I
count:

  • JMeter
  • SoapUI
  • XLT

I will get started on a template for a JMeter-based test harness, and
will post back here when I have a prototype. If someone does the same
using the other tools, I would welcome and be thankful for any sharing
on your part too :slight_smile:

Cheers!



(pulkitsinghal) #9

One criterion for testing is to assume that a certain number of users who
are searching against a particular field (for example product_name) in
parallel are all using unique terms.

Therefore, I would like to use my search index to gather all the unique
terms that are seen during indexing from a particular field ... and use
this as a dictionary in a JMeter test to assign all unique search words to
different threads/users and figure out what the performance would be like
in this scenario ... where users don't have the benefit of searching for
similar terms whose results have already been cached.
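The "unique terms per user" idea can be sketched roughly as below. This is only an illustration under assumptions: `partition_terms` is a hypothetical helper, the terms are assumed to be already extracted into a list, and how each slice is then fed to a JMeter thread is left open.

```python
def partition_terms(terms, num_users):
    """Deal de-duplicated terms round-robin so no two users share a term."""
    unique = list(dict.fromkeys(terms))  # drop repeats, keep first-seen order
    return [unique[i::num_users] for i in range(num_users)]


# Two simulated users, four input terms with one repeat.
slices = partition_terms(["red", "blue", "red", "green"], 2)
print(slices)  # [['red', 'green'], ['blue']]
```

Each inner list could be written to its own file, one word per line, so that every simulated user works through terms nobody else has searched for.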

I think that the Lucene toolkit already provides a SpellChecker module that
does something similar but I'm wondering:

  1. Does ElasticSearch already have this capability baked in somewhere, and
    is it as easy as making the right call? If so, please point me to it.
  2. If I pointed Lucene's SpellChecker module at an ES-built index, can I
    expect to be able to simply read it w/o any locking issues while ES
    is running?
  3. Rather than locating the index, building a path & feeding it in as a
    Directory into SpellChecker ... would there happen to be a better
    integration point from which to leverage the ES-built indices in code?

Thanks!

- Pulkit



(Otis Gospodnetić) #10

Hi,

If your goal is really to build a dictionary of unique terms, then I would
not even think about Lucene Spellchecker because I can think of 2 simple
ways of getting this dictionary:

  1. Look for the "words" file on any Linux machine and use that. Here it is
    on my local machine:
    $ tail -2 /usr/share/dict/words
    étude's
    études
    $ wc -l /usr/share/dict/words
    98569 /usr/share/dict/words

You could point JMeter to that.

  2. Just do a *:* style query or scan against your ES cluster, return some
    text field, store it, and then parse it with something to put a word per
    line to feed to JMeter.
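The "parse it and put a word per line" step of the second option might look like the sketch below. It assumes the field values have already been fetched and are available as plain strings; the function names are hypothetical, and the simple regex split is only an approximation of whatever analyzer ES applied at index time.

```python
import re


def build_word_list(field_values):
    """Return the sorted set of lowercase words found in the given strings."""
    words = set()
    for value in field_values:
        # \w+ is a rough stand-in for the index analyzer, not a replica of it.
        words.update(re.findall(r"\w+", value.lower()))
    return sorted(words)


def format_word_file(words):
    """Render the words one per line, ready to be saved as a dictionary file."""
    return "".join(word + "\n" for word in words)


values = ["Red Shoes", "red hat"]
print(format_word_file(build_word_list(values)), end="")
```

The resulting text can be saved to a file and used the same way as /usr/share/dict/words.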

Also, non-repeating queries are not common, so make sure this is really what
would happen in your env.
Plus, using only terms from your index may not be realistic either - people
sometimes use words that do not exist in the index (and get 0 hits).

Otis
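One way to act on the two caveats above is to generate a query list that mixes fresh index terms, repeats of earlier queries, and words that are not in the index. The sketch below is only an illustration: `build_query_mix` is a hypothetical helper, and the default ratios are made-up placeholders that should come from your own query logs.

```python
import random


def build_query_mix(index_terms, miss_terms, count,
                    repeat_ratio=0.5, miss_ratio=0.1, seed=0):
    """Build a query list with repeated queries and zero-hit words mixed in."""
    rng = random.Random(seed)  # fixed seed so a test run can be reproduced
    queries = []
    for _ in range(count):
        roll = rng.random()
        if queries and roll < repeat_ratio:
            # Re-issue an earlier query, which may be served from a cache.
            queries.append(rng.choice(queries))
        elif miss_terms and roll < repeat_ratio + miss_ratio:
            # A word assumed to be absent from the index (0 hits expected).
            queries.append(rng.choice(miss_terms))
        else:
            queries.append(rng.choice(index_terms))
    return queries
```

Setting `repeat_ratio=0` and `miss_ratio=0` gives the all-terms-from-the-index case, which makes it easy to compare that worst case against a more typical mix.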



