Percolation time varies a lot


I am testing the setup below on a single 40-core machine. It has 171 percolation documents under a single index (index1).

I created a test document to percolate against them. The test document does not change, and I am testing via
curl -XGET localhost:9200/index1/som_maping/_percolate -d '[doc json]'

Running the same doc at different times, the time taken to get the result varies from 50 ms to 250 ms. Why is there such a huge variation for the same doc?

There is no other operation running, either on Elasticsearch or on the machine. Any insight would be great!


How are you measuring the response time?

Would be good to know what is causing this. Did you measure network time or the took field that is included in the percolate response?

I am using the "took" field in the response, and I have also used

curl -w @curl_format.txt

in the command to make sure network latency is not the major cause of this. The network is not the problem; the percolation time on the ES side is not consistent.
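For anyone who wants to reproduce this comparison in a script, here is a minimal Python sketch that sets the server-reported "took" value against the wall-clock time seen by the client. The URL is the one from the curl command above. The function names `split_latency` and `timed_percolate` are made up for illustration, and wrapping the document as `{"doc": ...}` is an assumption about the request body, since the original post only shows a placeholder for it.

```python
import json
import time
import urllib.request

# URL taken from the curl command in this thread.
URL = "http://localhost:9200/index1/som_maping/_percolate"


def split_latency(body, elapsed_ms):
    """Split client-side elapsed time into the server's 'took' and the rest."""
    took_ms = json.loads(body)["took"]
    return {"took_ms": took_ms, "overhead_ms": elapsed_ms - took_ms}


def timed_percolate(doc, url=URL):
    """Send one percolate request and report where the time went."""
    # Assumption: the document is sent wrapped in a top-level "doc" key.
    payload = json.dumps({"doc": doc}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        method="GET",  # same verb as the curl -XGET call
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = response.read()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return split_latency(body, elapsed_ms)
```

If `overhead_ms` stays small while `took_ms` jumps around, the variation is on the Elasticsearch side rather than in the network or the client.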

My first hunch would be to check your memory usage and load, although it sounds like you have plenty of resources. Since percolation is completely in memory and your index and document are both kept there, percolator is going to be much more dependent on your available resources and have little to no dependency on your disk.

It's a 48-core machine with about 32 GB of RAM. There are only 170 queries in it, in a single index. No other operation is being performed on this machine except the curl requests I make, and both memory consumption and CPU usage are low.

ES is given 8 GB to start with

I would expect the first percolate requests to be slower than the requests made after them. Is this also the case here? Do the requests that take 250 ms occur just after ES has started, or does this happen randomly while you're testing things out? Also, how many requests did you run, and how many requests do you execute concurrently to test the percolator?

I just tried about 5-8 times, and the time varied at random, around the 2nd, 5th and 8th requests, I think. It was not only on the first request; the first one was in fact very fast. There is no concurrency here, as I am testing via curl in the terminal. I am planning to run some tests on another ES feature, and I will try to put together a table showing how the time varied with sample queries and docs.

I think this variance in query times is caused by noise from the fact that the JVM is 'cold'. If you run the query, let's say, 100 times, then I expect the query times to be more consistent.

What do you mean by the JVM being cold? I am running the document samples in sequence, one after another, manually, so the queries should all be loaded in memory after the first run. Requiring 100 runs before response times become consistent is asking too much, I think, since in some real scenarios you might have thousands of queries at the same time, and such a requirement might make the percolation API unusable.

No, I didn't mean that each query would need to be run 100 times in order to have a consistent query time. The JVM JIT compiler may not yet have optimized certain code paths. This can explain some variance in query time when running the first search / percolate requests.

I'm just saying that you may need to run your query a couple of times before you start measuring response times. Maybe the 100 times I suggested is a bit too extreme. In general, before benchmarking you should run a couple of (irrelevant) requests a number of times (let's say 10 times) and only then start running the requests whose query time you would like to measure.
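The warm-up advice above could be scripted roughly as follows. This is only a sketch: `benchmark` and `run_once` are hypothetical names, and `run_once` stands for any callable that issues one percolate request and returns its time in milliseconds (for example the "took" value). The default of 10 warm-up calls follows the suggestion above, while 20 measured runs is an arbitrary choice.

```python
import statistics


def benchmark(run_once, warmup=10, runs=20):
    """Discard warm-up calls, then summarize the measured ones."""
    # Warm-up phase: results are thrown away so that JIT compilation and
    # first-use costs do not end up in the reported numbers.
    for _ in range(warmup):
        run_once()

    # Measurement phase: keep every sample so the spread stays visible.
    samples = [run_once() for _ in range(runs)]
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Reporting min, median and max rather than a single average makes it easier to see whether the 50 ms to 250 ms spread persists once the warm-up calls are excluded.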