Not sure if this really belongs in Elasticsearch or Logstash, but since I'm using both, I'll put it in Elasticsearch.
I recently noticed our ELK cluster has thousands and thousands of fields named "column" followed by a number. I have no idea how they got there, but I want to get rid of them. I understand my only option may be a full reindex (painful), but if I'm wrong, please let me know. Also, does anyone know how these fields could have appeared in the first place?
We're feeding the cluster from Filebeat via Logstash, plus data coming in through the Logstash syslog input plugin.
I suspect they were generated by a Logstash csv filter: by default it auto-generates column names in exactly that format ("column1", "column2", ...) whenever it encounters columns it has no configured names for.
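For reference, this is roughly what the fix looks like in a Logstash pipeline: give the csv filter an explicit `columns` list so it never has to invent names. This is a hedged sketch, not your actual config; the column names below are made-up placeholders you'd replace with the real fields of your log format.

```
filter {
  csv {
    # Without an explicit "columns" list, the csv filter names any
    # unconfigured columns "column1", "column2", ... automatically,
    # and those names end up as fields in the Elasticsearch mapping.
    # Placeholder names; substitute the actual columns of your logs.
    columns   => ["log_time", "user_name", "database_name", "application_name", "message"]
    separator => ","
  }
}
```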
That was exactly it. A missing application_name field when CSV-parsing our PostgreSQL server replication logs. Thanks! That led me right to the solution.
Thank you! Good idea, but we have way too much data to reindex in a timely manner. I've fixed the problem with the CSV parsing (described in another post in this thread). I presume that new indices will not have these extra fields after the old indices age out?
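That's my understanding too: the mapping is per index, so once the indices created before the fix age out, the auto-generated fields go with them. To confirm which indices still carry them, you can fetch `GET <index>/_mapping` and scan the field names. Here's a minimal, self-contained sketch of that scan; the sample mapping dict and index name are invented for illustration, and the response shape shown matches recent Elasticsearch versions (top-level `mappings.properties`).

```python
import re

# Hypothetical, trimmed-down sample of a GET <index>/_mapping response.
# In practice you'd load this from the cluster with your HTTP client of choice.
mapping = {
    "logstash-2024.01.01": {
        "mappings": {
            "properties": {
                "message": {"type": "text"},
                "column1": {"type": "text"},
                "column2": {"type": "text"},
                "application_name": {"type": "keyword"},
            }
        }
    }
}

def autogenerated_columns(mapping: dict) -> list[str]:
    """Return field names matching the csv filter's default column<N> pattern."""
    pattern = re.compile(r"^column\d+$")
    found = []
    for index_body in mapping.values():
        props = index_body.get("mappings", {}).get("properties", {})
        found.extend(name for name in props if pattern.match(name))
    return sorted(found)

print(autogenerated_columns(mapping))
```

Running this across each index tells you which ones still hold the stray fields, so you know when they've all rotated out.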