Hello, my name is João Sakai and I'm in the middle of the greatest challenge of my life:
"Using Logstash, I have to index one CSV file with 46 columns and 570 million rows into Elasticsearch as soon as possible."
Amazon Instance Type: m3.medium
Index Template Config:
As you can see, I have already applied the optimizations "number of replicas: 0" and "refresh interval: -1".
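For readers following along, those two optimizations correspond to the `number_of_replicas` and `refresh_interval` index settings. The sketch below only builds the JSON bodies; it is not the template used here, and the "restore" values (1 replica, 1s refresh) are assumed example values to put back once the load finishes.

```python
import json

# Hypothetical settings bodies for illustration only; not the actual
# index template from this post.
bulk_load_settings = {
    "index": {
        "number_of_replicas": 0,   # no replica copies while bulk loading
        "refresh_interval": "-1",  # disable periodic refresh while bulk loading
    }
}

# Assumed values to restore after the load completes.
restore_settings = {
    "index": {
        "number_of_replicas": 1,
        "refresh_interval": "1s",
    }
}

print(json.dumps(bulk_load_settings, indent=2))
print(json.dumps(restore_settings, indent=2))
```

Either body could be sent to the index settings endpoint before and after the import, assuming the index already exists.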
Amazon Instance Type: m3.2xlarge
I'm following the Indexing performance guide: https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html
My results were really disappointing!
After running the indexing process for 1 hour, only 2 million documents had been indexed, which leads me to think that something is wrong with the Logstash configuration, the Elasticsearch configuration, or something else.
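To put that rate in perspective, here is a quick back-of-the-envelope calculation using only the figures above (2 million documents in 1 hour, 570 million rows in total). It assumes the rate stays constant for the whole file, which may not hold in practice.

```python
# Figures taken from the post; the constant-rate assumption is a simplification.
docs_indexed = 2_000_000
elapsed_hours = 1
total_docs = 570_000_000

# Observed throughput in documents per second.
docs_per_second = docs_indexed / (elapsed_hours * 3600)

# Projected time to index the whole file at the same rate.
hours_needed = total_docs / (docs_indexed / elapsed_hours)
days_needed = hours_needed / 24

print(f"{docs_per_second:.0f} docs/s")
print(f"{hours_needed:.0f} hours (~{days_needed:.1f} days) for the full file")
```

At roughly 556 documents per second, the full file would take about 285 hours, or close to 12 days.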
Is some configuration wrong? Do I have to change the EC2 instance configuration?
Could someone give me some insights into how to index such a large volume of data?