Index a large dataset into Elasticsearch


I've been playing around with Elasticsearch and Graph for a while, and a lot of it looks very promising!

I'm stuck on preparing JSON datasets containing millions of rows for the bulk API. Each document needs an action header line added before it, which is doable by hand when the dataset is small, but with a large dataset I don't know what the options are. I can't find it in the O'Reilly book, and despite a lot of googling I haven't found a definitive answer on how experienced Elasticsearch users index large datasets with the bulk API when each line needs a header. Do you use programming languages or other tools?
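For reference, the bulk API expects newline-delimited JSON where each document is preceded by an action line. A short script can interleave those headers; this is a minimal sketch (the index name `myindex` and the sample documents are placeholders, not from the original post):

```python
import json

def to_bulk(lines, index="myindex"):
    """Interleave a bulk action header before each JSON document line.

    `lines` is an iterable of JSON strings, one document per line;
    `index` is the target index name (a placeholder here).
    Returns a newline-delimited string suitable for POST /_bulk.
    """
    action = json.dumps({"index": {"_index": index}})
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the source file
        out.append(action)
        out.append(line)
    # The bulk body must end with a trailing newline.
    return "\n".join(out) + "\n"

# Example: two documents become four lines of bulk body.
docs = ['{"name": "a"}', '{"name": "b"}']
body = to_bulk(docs)
```

For millions of rows you would stream the input file line by line rather than build the body in memory; official Elasticsearch clients (for example the Python client's bulk helper) can also generate these headers for you.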

Any help would be appreciated. I can read some code but I'm not a programmer by the way.

You can use Logstash for this. It has both a codec and a filter for parsing JSON data, especially if it is one JSON object per line, and it should be relatively easy to set up.
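A pipeline for the one-object-per-line case could look roughly like this (the file path and index name are placeholders you would adapt):

```
input {
  file {
    path => "/path/to/data.json"      # hypothetical path to the dataset
    start_position => "beginning"     # read the whole file, not just new lines
    sincedb_path => "/dev/null"       # don't remember position between runs
    codec => "json_lines"             # one JSON object per line
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "myindex"
  }
}
```

The elasticsearch output batches documents and adds the bulk headers itself, so no manual preprocessing is needed.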

Sounds good, thanks for your quick reply! I will give it a go.