Best way to index and map large CSV files into Elasticsearch with Python

Hey guys,

I have two questions about indexing large CSV files into Elasticsearch and mapping the data:

  1. Should I first create an index and mapping, and then send the CSV data to Elasticsearch?
    If yes, could someone provide a code snippet/template for creating an index and mapping (corresponding to the CSV header) with the elasticsearch-py package?

  2. I use helpers.bulk() to index my CSV data, but I haven't found a way to apply a mapping to it from Python. Could someone also provide a code snippet for indexing CSV files into a specific index, so that the data matches that index's mapping?

Thanks in advance!

Hi,

You can also use an index template to define your mapping, so you don't need to create the index and mapping yourself: both are created automatically when you index your CSV data.

https://www.elastic.co/guide/en/elasticsearch/reference/7.1/indices-templates.html
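
For example, here's a minimal sketch with the elasticsearch-py 7.x client, assuming a local cluster; the template name, index pattern, and field names are placeholders you'd replace with your own CSV headers:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Legacy index template (ES 7.x): the mapping is applied automatically
# to every new index whose name matches index_patterns.
# Field names below are placeholders -- use your actual CSV headers.
es.indices.put_template(
    name="csv-data-template",
    body={
        "index_patterns": ["csv-data-*"],
        "mappings": {
            "properties": {
                "timestamp": {"type": "date"},
                "name": {"type": "keyword"},
                "value": {"type": "float"},
            }
        },
    },
)
```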

https://docs.python.org/3/library/csv.html <--- check the class csv.DictReader

As it returns a dict per row, you can build your bulk actions from those and send them to Elasticsearch.
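
Something like this, as a sketch (the file name and index name are made up; the index name matches the template pattern above, so the mapping is applied when the index is auto-created on the first bulk write):

```python
import csv

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def csv_actions(path, index_name):
    # csv.DictReader yields one dict per row, keyed by the CSV header,
    # so each row can be used directly as the document _source.
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield {"_index": index_name, "_source": row}

helpers.bulk(es, csv_actions("data.csv", "csv-data-2019"))
```

Note that DictReader gives you every value as a string; Elasticsearch will coerce them to the mapped types (date, float, ...) at index time, as long as the strings are parseable.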
