Indexing 100 million records from PostgreSQL

There is no way to answer that question in the abstract. Even a single-CPU t2.micro cloud instance with 500 MB of heap can feed a billion records to Elasticsearch; it will just take longer than a larger instance would (unless Elasticsearch itself is the bottleneck).

If you are not using any filters and are just pulling data out of the jdbc input, I would not expect Logstash to need a massive amount of memory. A minimal pipeline looks something like the sketch below.
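Here is a minimal sketch of such a pipeline, assuming a local PostgreSQL database and a local Elasticsearch. Every value (the driver path, database name, credentials, table, and index name) is a placeholder you would replace with your own:

```
input {
  jdbc {
    jdbc_driver_library => "/path/to/postgresql-jdbc.jar"   # placeholder path to the PostgreSQL JDBC driver
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
    jdbc_user => "me"
    jdbc_password => "secret"
    statement => "SELECT * FROM records"
    jdbc_fetch_size => 10000   # hint to the driver to fetch rows in batches rather than buffering the whole result set
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "records"
  }
}
```

The `jdbc_fetch_size` setting is the main knob that keeps memory flat when pulling a very large table, since it avoids materializing all 100 million rows at once.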

I suggest you experiment. Measure the number of documents you can index per second on a given set of hardware and decide whether that is fast enough. If not, work out whether the bottleneck is the database, Logstash, or Elasticsearch, tune that subsystem, then measure again and repeat. Nobody else can do this for you.
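One rough way to measure sustained throughput is to sample the indexing counter from the Elasticsearch stats API twice and divide by the interval. This sketch assumes Elasticsearch is on localhost:9200 and that `jq` is installed:

```
before=$(curl -s 'localhost:9200/_stats/indexing' | jq '._all.total.indexing.index_total')
sleep 60
after=$(curl -s 'localhost:9200/_stats/indexing' | jq '._all.total.indexing.index_total')
echo "docs/sec: $(( (after - before) / 60 ))"
```

Run it while the pipeline is under steady load; a short sample during startup or a paging stall will mislead you.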
