How to ingest very large json file into elastic search quickly

leandrojmp · June 21, 2024, 4:37am

What options have you changed? In your other question about a template, the template didn't change the default refresh_interval, which can heavily impact performance.

What refresh_interval did you set for your index? Also, did you create a mapping for your data, or are you using dynamic mapping? This also impacts performance.
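To make those two questions concrete, here is a minimal Python sketch of a create-index request body that turns off periodic refresh during the load and declares an explicit mapping. The helper name `build_index_body` and the fields `timestamp` and `message` are hypothetical placeholders, not taken from your data; `refresh_interval` and `dynamic` are the actual Elasticsearch setting and mapping parameter, but the right values depend on your use case.

```python
import json


def build_index_body(refresh_interval="-1"):
    """Return a create-index request body for a bulk load (hypothetical helper)."""
    return {
        "settings": {
            "index": {
                # "-1" disables periodic refresh while the bulk load runs.
                "refresh_interval": refresh_interval,
            }
        },
        "mappings": {
            # False: fields not listed below are kept in _source but not indexed.
            "dynamic": False,
            "properties": {
                # Placeholder fields; replace with the fields in your documents.
                "timestamp": {"type": "date"},
                "message": {"type": "text"},
            },
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_index_body(), indent=2))
```

Once the load is finished, you would normally set refresh_interval back to a regular value (the default is 1s) so new documents become searchable again.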

Please share the specs of your Elasticsearch cluster, your elasticsearch.yml, and the tuning options you changed. Without knowing what you have already done, it is hard to provide any feedback.

Also, how many TB are we talking about for this file? Tens of TB? Hundreds of TB?

Another thing, is this mount path a network share?

mountPath: /mnt/logstash/

This is not a trivial issue. There are many variables: disk speed, tuning configuration, network speed, etc.

I do not use Logstash to read files anymore. I normally put all my data into Kafka clusters and have multiple Logstash instances reading from the Kafka topics, but to get your data into Kafka you would also need some tool to read the file and publish it to the topics.

I'm not sure this is justified for a one-time ingestion, or even whether it would change anything.
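For reference, the "tool to read the file" part can be quite small. Below is a rough Python sketch that streams a file in fixed-size batches and hands each batch to a callback. It assumes the file is newline-delimited JSON (one document per line), which may not be true for your file; `read_batches`, `forward`, and the `send` callback are hypothetical names, and `send` stands in for whatever actually publishes to Kafka or any other destination.

```python
import json
from itertools import islice


def read_batches(path, batch_size=1000):
    """Yield lists of parsed JSON documents, at most batch_size per list."""
    with open(path, "r", encoding="utf-8") as handle:
        # Lazily skip blank lines so the whole file is never held in memory.
        lines = (line for line in handle if line.strip())
        while True:
            batch = [json.loads(line) for line in islice(lines, batch_size)]
            if not batch:
                return
            yield batch


def forward(path, send, batch_size=1000):
    """Pass each batch to send() and return the number of documents read."""
    total = 0
    for batch in read_batches(path, batch_size):
        send(batch)  # assumption: send() publishes the batch somewhere
        total += len(batch)
    return total
```

Calling `forward("data.ndjson", print, batch_size=2)` would simply print each batch, which is a cheap way to check the reading side before wiring in a real producer.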
