How to index CSV files with a large number of columns/fields

Hi all,

We have a lot of CSV files with varying numbers of columns. The majority have just a few columns, but some have more than 1000. At first we decided to map each column as a field, but some files would exceed the index.mapping.total_fields.limit setting. We can increase that limit, of course, but it turns into something like the Sorites paradox: if we raise it to 1001, what about a new file with 1002 columns? In general, if we raise it to N, what about a file with N+1 columns, and so on. And that's on top of the performance issues a large number of fields can cause.
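To be concrete, this is roughly what we'd have to keep doing every time a wider file shows up. A minimal sketch with the 8.x Python client; the cluster URL, index name `csv-wide`, and the limit value 2000 are placeholders:

```python
# Sketch: raising index.mapping.total_fields.limit on an existing index.
# The cluster URL, index name "csv-wide", and the value 2000 are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# It's a dynamic index setting, so it can be updated without reindexing,
# but the number has to be bumped again whenever an even wider file arrives.
es.indices.put_settings(
    index="csv-wide",
    settings={"index.mapping.total_fields.limit": 2000},
)
```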

I know we could use the Flattened field type, but I assume performance would suffer compared to mapping each column as a field. So we're caught halfway between the two options! I'm curious to know whether there is a third option, and more generally, what the best practice is for cases like this in Elasticsearch?
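For context, the flattened approach we're weighing would look roughly like this (a sketch with the 8.x Python client; the index name `csv-flat`, the field name `row`, and the file name are placeholders). The whole CSV row goes into one `flattened` field, so the mapping stays at a handful of fields regardless of column count, at the cost of every value being treated as a keyword:

```python
# Sketch: storing an entire CSV row under a single "flattened" field so the
# mapped field count stays constant no matter how many columns a file has.
# Index name "csv-flat", field name "row", and the file name are placeholders.
import csv
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="csv-flat",
    mappings={"properties": {"row": {"type": "flattened"}}},
)

# Each CSV row becomes one document, with all of its columns nested under "row".
with open("wide_file.csv", newline="") as f:
    for row in csv.DictReader(f):
        es.index(index="csv-flat", document={"row": row})
```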
