Speed Tuning ES 5.4 for Unit Tests (Ruby)

I have a bunch of unit tests where I test my implemented search logic, and the test setup is rather slow - ~1.1 seconds per test.

Basically what is happening is:

  • Build test data to feed to ES (~70ms)
  • Set up the ES index with the mapping I'll be using in production (~350ms)
  • Bulk insert the test data (~90ms)
  • Misc. other stuff
  • Wait for the test data to become available to queries before starting the test. This is done by running an empty search and checking that the hits total matches the number of documents inserted; if not all documents are readable yet, the setup sleeps briefly and tries again (~600ms)
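For readers following along, the polling step in the last bullet can be sketched roughly as below. This is a minimal illustration, not the actual test code: `wait_for_documents` is a hypothetical helper name, and the block passed to it stands in for whatever runs the empty search and returns the hits total.

```ruby
# Hypothetical helper: polls until the block reports at least `expected`
# hits, or gives up once `timeout` seconds have passed.
# The block is assumed to run an empty search and return the hits total.
def wait_for_documents(expected, timeout: 5.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    hits = yield
    return true if hits >= expected
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end
```

A call would look like `wait_for_documents(2) { search_hits_total }`, where `search_hits_total` is again a placeholder for the real search call. The sleep interval puts a floor on how quickly the loop can notice new documents, which is one reason this step tends to dominate the setup time.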

So there are two big time blocks here: Setting up the index from the mapping and waiting for documents (in my test setup currently 2) to become available. Since the tests are sequential it would be really nice to speed things up as much as possible. Parallelizing the tests is currently out of the question, since this is a legacy code base and I still have large "Here be dragons" areas on the map.

I've already had a look at the performance tuning tips in the documentation, and disabling write throttling brought the document wait time down by about 50ms (from 620ms to 570ms).

But nothing else I've tried had much effect (reducing the number of shards, disabling replication, moving the storage path to a RAM disk).

Is there anything else I can try to speed things up? For example, something like in an RDBMS, where you can throw data consistency overboard by disabling transaction safety.

Read/write concurrency is not an issue for my test setup. The test environment uses an NVMe SSD as a storage path and I have lots of RAM available (>32GB) if that helps.

I just found out about the 'refresh' parameter. Setting this to true cut the waiting time for the data to become available down to ~50ms (and allowed me to get rid of the polling loop).
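As a rough sketch of what this looks like from Ruby, the snippet below builds a bulk request with `refresh=true` using only the standard library. The helper names, the index name `tests`, and the type name `doc` are placeholders chosen for illustration, and the base URL assumes a local node on the default port.

```ruby
require "json"
require "net/http"
require "uri"

# Builds the newline-delimited bulk body: one action line followed by one
# source line per document, ending with a newline.
# `docs` is a Hash of document id => source Hash.
def bulk_body(index, type, docs)
  lines = docs.flat_map do |id, source|
    [{ index: { _index: index, _type: type, _id: id } }.to_json, source.to_json]
  end
  lines.join("\n") + "\n"
end

# Builds (but does not send) a bulk request that asks ES to refresh the
# affected shards before responding, so the documents are searchable as
# soon as the call returns.
def bulk_request(base_url, index, type, docs)
  uri = URI.join(base_url, "/_bulk?refresh=true")
  req = Net::HTTP::Post.new(uri)
  req["Content-Type"] = "application/x-ndjson"
  req.body = bulk_body(index, type, docs)
  [uri, req]
end

# Sending it, assuming a node is listening on localhost:9200:
#   uri, req = bulk_request("http://localhost:9200", "tests", "doc", "1" => { "title" => "a" })
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
```

Forcing a refresh on every write is generally discouraged for production indexing, but for a handful of test documents the cost is small compared to polling.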

That still leaves the question of how to speed up index creation. Maybe I'll just move it into a global setup phase and reuse the same set of indices for all my tests. I'll have to test how fast an index can be emptied of all data to see if that is a viable alternative.

Could you also change the index settings to use only 1 shard and no replicas?

Yes, the times provided are already with 1 shard and no replica.
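For reference, a one-shard, zero-replica index can be requested through the settings in the create-index body. The sketch below only builds that JSON body; `create_index_body` is a hypothetical helper, and the mapping passed in is a made-up example rather than the production mapping discussed here.

```ruby
require "json"

# Builds the JSON body for a create-index call (PUT /<index>) with a single
# primary shard and no replicas. `mappings` is whatever mapping Hash the
# tests need; the one in the usage line below is only an example.
def create_index_body(mappings)
  {
    settings: { index: { number_of_shards: 1, number_of_replicas: 0 } },
    mappings: mappings
  }.to_json
end

# Example usage with a hypothetical type "doc" and a single text field:
#   create_index_body(doc: { properties: { title: { type: "text" } } })
```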

I just tested setting up a global ES index pre-flight and refresh=true on the bulk inserts. The per-test setup time is now down to ~75ms and the one-off setup cost for the global test index isn't really noticeable in the total test runtime.
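How the shared index gets emptied between tests isn't spelled out above. One possible approach, offered here as an assumption rather than a description of the setup used, is a delete-by-query with `match_all` and `refresh=true`. The helper name and the index name `tests` are placeholders.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a request that deletes every document in
# `index` while leaving the index and its mapping in place.
# refresh=true makes the deletions visible to searches right away;
# conflicts=proceed keeps going if a document changed in the meantime.
def clear_index_request(base_url, index)
  uri = URI.join(base_url, "/#{index}/_delete_by_query?refresh=true&conflicts=proceed")
  req = Net::HTTP::Post.new(uri)
  req["Content-Type"] = "application/json"
  req.body = { query: { match_all: {} } }.to_json
  [uri, req]
end

# Sending it before each test, assuming a node on localhost:9200:
#   uri, req = clear_index_request("http://localhost:9200", "tests")
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
```

Whether this is faster than dropping and recreating the index depends on the mapping and the document count, so it is worth timing both in the actual test suite.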

While not as clean as a new index for each test case, I can live with this setup. :)

The problem with that strategy is mappings. But as long as you are using the same mapping for every test, that should work.

True, but since I'm planning on using only one mapping per index I'm guessing it should work.
