Performance when inserting/deleting many documents

Hello:

We have a very old Elastic 2.2.0 instance running in production that contains approximately 26 million documents. We also insert 230k (yes, thousands) documents a week and mark 30k documents a week for deletion. The current solution uses an Elastic UberJar that serves as a secondary Elastic server: it receives the insert and delete instructions and then syncs with the main Elastic server.

I would like to know whether it would be OK to upgrade to the latest Elastic and perform all transactions directly through the High-Level REST Client (that is, eliminating the UberJar and no longer running it on a background thread). That would mean about 90 inserts/deletes a minute. We would still have to run a single Elastic instance (due to data center capacity). We cannot move to the cloud (not fast enough, at least) due to HIPAA and PHI compliance.

Many, many thanks in advance for your advice.

Kindly,

Roman

ES 7.x is definitely capable of handling a few hundred thousand document operations per week; there are plenty of examples of clusters handling millions or tens of millions in that timeframe. The real question is whether it will manage with your particular setup: disk I/O, hardware resources, and so forth. You would need to test this to be 100% sure. In general, there have been many efficiency and speed improvements since 2.2.0.
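If the inserts and deletes end up going straight to the cluster, one option is to batch them through the `_bulk` endpoint rather than sending one request per document. Below is a minimal sketch of building such a request body in Python, using only the standard library. The index name `records`, the function name, and the sample documents are made up for illustration; it assumes ES 7.x, where the action lines no longer need a `_type`.

```python
import json


def build_bulk_body(index, upserts, delete_ids):
    """Build an NDJSON body for POST /_bulk.

    index      -- target index name (hypothetical in the example below)
    upserts    -- dict mapping document id -> document source
    delete_ids -- iterable of document ids to delete
    """
    lines = []
    for doc_id, source in upserts.items():
        # An "index" action line is followed by the document source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    for doc_id in delete_ids:
        # A "delete" action has no source line.
        lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body("records", {"1": {"name": "example"}}, ["2"])
    print(body, end="")
```

The resulting string would be sent with `Content-Type: application/x-ndjson`. How many operations to put in one batch is something to tune during testing on your own hardware.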

If you were to set up an on-premises ES 7.9 instance and load a data dump into it, is there a way for you to send all your requests to it asynchronously as well? That would give you a good idea of how it performs without blocking your app.
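To make the "send asynchronously" idea concrete, here is a rough sketch of mirroring each operation to the test instance from a background thread, so that a slow or failing test cluster never blocks or breaks the application. Everything here is an assumption for illustration: the `ShadowWriter` class, the `send` callable (which would wrap whatever client call you use against the 7.9 instance), and the queue size.

```python
import queue
import threading


class ShadowWriter:
    """Forward operations to a test cluster without blocking the caller."""

    def __init__(self, send, max_pending=1000):
        self._send = send                    # hypothetical: performs the real request
        self._queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0                     # operations skipped because the queue was full
        self.failed = 0                      # operations whose send raised an error
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, operation):
        """Called from the application; returns immediately."""
        try:
            self._queue.put_nowait(operation)
        except queue.Full:
            self.dropped += 1                # never block the production path

    def _run(self):
        while True:
            operation = self._queue.get()
            try:
                self._send(operation)
            except Exception:
                self.failed += 1             # the test cluster must not affect the app
            finally:
                self._queue.task_done()

    def drain(self):
        """Wait until every queued operation has been attempted."""
        self._queue.join()


if __name__ == "__main__":
    sent = []
    writer = ShadowWriter(sent.append)       # stand-in for a real request
    writer.submit({"op": "index", "id": "1"})
    writer.submit({"op": "delete", "id": "2"})
    writer.drain()
    print(len(sent), writer.dropped, writer.failed)
```

Watching the `dropped` and `failed` counters alongside the test instance's own metrics would give a first indication of whether it keeps up with production traffic.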
