Indexing 100 million records from PostgreSQL

There is no way to answer that question in the abstract. Even a single-CPU t2.micro cloud instance with 500 MB of heap can feed a billion records to Elasticsearch; it will just take longer than a larger instance would (unless Elasticsearch itself is the bottleneck).

If you are not using any filters and are just pulling data out of the jdbc input, I would not expect Logstash to need a massive amount of memory. A minimal pipeline looks something like the sketch below.
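Here is a minimal sketch of such a pipeline, assuming a local PostgreSQL database and a local Elasticsearch. Every value (the driver path, database name, credentials, table, and index name) is a placeholder you would replace with your own:

```
input {
  jdbc {
    jdbc_driver_library => "/path/to/postgresql-jdbc.jar"   # placeholder path to the PostgreSQL JDBC driver
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
    jdbc_user => "me"
    jdbc_password => "secret"
    statement => "SELECT * FROM records"
    jdbc_fetch_size => 10000   # hint to the driver to fetch rows in batches rather than buffering the whole result set
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "records"
  }
}
```

The `jdbc_fetch_size` setting is the main knob that keeps memory flat when pulling a very large table, since it avoids materializing all 100 million rows at once.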

I suggest you experiment. Measure the number of documents you can index per second on a given set of hardware and decide whether that is fast enough. If not, work out whether the bottleneck is the database, Logstash, or Elasticsearch, tune that subsystem, then measure again and repeat. Nobody else can do this for you.
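One rough way to measure sustained throughput is to sample the indexing counter from the Elasticsearch stats API twice and divide by the interval. This sketch assumes Elasticsearch is on localhost:9200 and that `jq` is installed:

```
before=$(curl -s 'localhost:9200/_stats/indexing' | jq '._all.total.indexing.index_total')
sleep 60
after=$(curl -s 'localhost:9200/_stats/indexing' | jq '._all.total.indexing.index_total')
echo "docs/sec: $(( (after - before) / 60 ))"
```

Run it while the pipeline is under steady load; a short sample during startup or a paging stall will mislead you.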
