Hello,
I have a use case where I need to add a document, run one or more queries against it, and then delete it, since the document is no longer needed on the cluster once I have the result.
It's a PHP app that searches for almost 1.5k keywords in each document. I tried several approaches, looking for the best performance:
- One query per keyword, sending the requests concurrently with the "lazy" option.
Result --> With a high number of requests per second, the Elasticsearch CPU spikes and sometimes it cannot keep up with the queued requests.
- Multiple requests using msearch, split into chunks of 100. After each chunk I check whether I have results and, if I do, I skip the remaining requests.
- Generating a query with 1.5k clauses. The clause limit is 1024, so I again split the keywords into chunks of 100 and followed the same approach as before: check each iteration for results and skip further requests once a result is found.
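To make the third approach concrete, here is a minimal sketch of the chunk-and-stop-early loop. It is written in Python only for illustration (the real app is PHP), and it is not my production code. The field name `content`, the use of `match_phrase`, and the `search` callable that stands in for the Elasticsearch client are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterator, List, Optional


def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bool_query(keywords: List[str], field: str = "content") -> Dict:
    """Build one bool query with a `should` clause per keyword.

    `field` is a hypothetical field name; `match_phrase` is an assumed
    query type, the real app may use a different one.
    """
    return {
        "size": 0,  # only the hit count is needed, not the documents
        "query": {
            "bool": {
                "should": [{"match_phrase": {field: kw}} for kw in keywords],
                "minimum_should_match": 1,
            }
        },
    }


def first_matching_chunk(
    keywords: List[str],
    search: Callable[[Dict], Dict],
    chunk_size: int = 100,
) -> Optional[List[str]]:
    """Send one query per chunk and stop at the first chunk with hits.

    `search` is a placeholder for whatever sends the body to Elasticsearch
    and returns the decoded response. In ES 6.x `hits.total` is an integer.
    """
    for batch in chunks(keywords, chunk_size):
        response = search(build_bool_query(batch))
        if response["hits"]["total"] > 0:
            return batch  # result found, skip the remaining chunks
    return None
```

With 1.5k keywords and chunks of 100 this sends at most 15 requests per document, and fewer whenever an early chunk matches.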
The third approach gives us the best results: it lets us handle 3k requests/min to our app and 7k requests/s to Elasticsearch.
However, CPU usage on Elasticsearch is irregular and sometimes spikes to 100%, while normal usage is around 40%.
I'm looking for some help, or maybe another approach to this use case.
My Elasticsearch cluster is hosted on Amazon, running ES 6.3 with c4.2xlarge.elasticsearch nodes.
Any ideas?
Thanks in advance!