Thousands of fields named "column" followed by a number

Not sure if this really belongs in Elasticsearch, or in Logstash, but since I'm using both, I'll put it in Elasticsearch.

I recently noticed our ELK cluster has thousands and thousands of fields named "column" followed by a number. I have no idea how they got there, but I want to get rid of them. I understand my only option may be a full reindex (a pain), but if I'm wrong, please let me know. Also, does anyone know how these fields could have appeared in the first place?

We're feeding the cluster with Filebeat via Logstash, plus data coming in through the Logstash syslog plugin.

You could use the reindex API to reindex the data and remove the field(s):

https://www.elastic.co/guide/en/elasticsearch/reference/2.3/docs-reindex.html
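As a rough sketch of what such a reindex request could look like, the Python below only builds the JSON body for `POST _reindex` and prints it. The index names and the field list are made up for illustration. The script is written in Painless with a `source` key, which is an assumption about your version: 2.x clusters use Groovy and older releases spell the key `inline`, so adjust to match the docs for the version you run.

```python
import json

def build_reindex_body(source_index, dest_index, fields_to_drop):
    """Build a _reindex request body that strips the given fields from each document.

    Assumption: the cluster accepts Painless scripts with a "source" key.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {
            "lang": "painless",
            # Remove each unwanted field from the document before it is written.
            "source": "for (f in params.fields) { ctx._source.remove(f) }",
            "params": {"fields": list(fields_to_drop)},
        },
    }

if __name__ == "__main__":
    # Hypothetical index names and fields, for illustration only.
    body = build_reindex_body(
        "logstash-old", "logstash-clean", ["column24", "column25"]
    )
    print(json.dumps(body, indent=2))
```

The documents are copied into a new index, so the unwanted fields are absent from the new index's mapping; the old index keeps its mapping until it is deleted.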

I suspect they were generated by a Logstash csv filter. By default it auto-generates field names in that format (column1, column2, and so on) for any column it has no configured name for.
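To illustrate the naming behaviour described above, here is a small Python imitation of it. This is not the filter's actual code, just a sketch assuming that configured names are applied by position and every remaining column gets "column" plus its 1-based position. The column names and sample line are invented.

```python
import csv
import io

def name_columns(line, configured_names):
    """Imitate positional naming: use the configured name if one exists,
    otherwise fall back to an auto-generated "columnN" (1-based)."""
    values = next(csv.reader(io.StringIO(line)))
    event = {}
    for i, value in enumerate(values):
        if i < len(configured_names):
            key = configured_names[i]
        else:
            key = f"column{i + 1}"
        event[key] = value
    return event

if __name__ == "__main__":
    # Hypothetical config with only three names, fed a four-column line.
    names = ["log_time", "user_name", "database_name"]
    print(name_columns("2016-05-01,postgres,mydb,walreceiver", names))
```

One name too few in the configuration is enough to create a new field, and every extra column after that creates another one.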


That was exactly it. A missing application_name field when CSV-parsing our PostgreSQL server replication logs. Thanks! That led me right to the solution.

Thank you! Good idea but we have way too much data to reindex in a timely manner. I've fixed the problem with the CSV parsing (described in another post in this thread). I presume that new indices will not have these extra columns after the old indices age out?
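One way to check that presumption once a new index has been created is to look at its mapping and list any field names that still match the pattern. The helper below is a minimal sketch: it takes the `properties` section of a mapping as a plain dict (however you fetched it) and only inspects top-level fields. The sample mapping is invented.

```python
import re

# Matches names such as "column7" but not "column_name" or "columns".
AUTO_NAME = re.compile(r"^column\d+$")

def find_autogenerated_fields(properties):
    """Return the top-level field names that look auto-generated, sorted by number."""
    matches = [name for name in properties if AUTO_NAME.match(name)]
    return sorted(matches, key=lambda name: int(name[len("column"):]))

if __name__ == "__main__":
    # Hypothetical mapping fragment, for illustration only.
    sample = {
        "message": {"type": "string"},
        "column12": {"type": "string"},
        "column3": {"type": "string"},
        "column_name": {"type": "string"},
    }
    print(find_autogenerated_fields(sample))
```

An empty result for a freshly created index would suggest the fix worked, provided no index template defines those fields explicitly.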