I am looking to use Elasticsearch to index about 400 million records, split across 50 files with about 360 columns each.
Once indexed, the data will remain static; I am just looking for the best approach to the initial load.
The data is in CSV format. I signed up for Google Compute Engine and spun up 3 ES instances.
I attempted to use Logstash locally on my MacBook to send the files to the remote ES cluster, but I am only getting about 400 documents per second.
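For context, my pipeline is roughly the sketch below; the file path, index name, host address, and column list are placeholders (the real files have about 360 columns):

```
input {
  file {
    # Placeholder path; the 50 CSV files live here
    path => "/data/csv/*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    # Truncated for brevity; the real list has ~360 column names
    columns => ["col1", "col2", "col3"]
  }
}

output {
  elasticsearch {
    # Placeholder address for one of the three GCE instances
    hosts => ["http://GCE_INSTANCE_IP:9200"]
    index => "records"
  }
}
```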
There has to be a better approach to loading this much data.