When developing a new Logstash filter, I got some fields mapped as string rather than number by mistake. Now I want to find all the doc IDs where specific fields are of type string rather than number, so I can clean up my docs from Kibana and avoid this warning from Kibana:
Mapping conflict! 6 fields are defined as several types (string, integer, etc) across the indices that match this pattern. You may still be able to use these conflict fields in parts of Kibana, but they will be unavailable for functions that require Kibana to know their type. Correcting this issue will require reindexing your data
The question is just how to do this; hints are much appreciated!
You don't need to find the individual doc IDs, but rather the index names, since all documents in an index are indexed using the same mapping. Hence, you need to use the mapping API to find out which indices are affected.
Here is an example: say your index pattern is "logstash-*"; then you can find the mappings of all indices with
GET /logstash-*/_mapping
Now, there is a nice little utility called gron which you can use to grep for the fields you're interested in. Say the field is called foo: run gron over the mapping output and grep for foo. From the matching lines you can see that the affected index is logstash-2016. If it's just a couple of indices this is probably manageable; otherwise you should postprocess the result with awk or the like.
You can also try jq, but gron was easier for me in this case.
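If many indices are involved, the same check can be scripted instead of grepped by hand. A minimal Python sketch, assuming you have already fetched `GET /logstash-*/_mapping` and parsed it into a dict (the field name `foo` and the index names below are made-up examples):

```python
# Walk a parsed _mapping response and report which indices map a
# given field as "string". The structure mirrors the 2.x-era
# response: {index: {"mappings": {doc_type: {"properties": {...}}}}}.
def indices_with_string_field(mappings, field):
    hits = []
    for index, body in mappings.items():
        for doc_type in body.get("mappings", {}).values():
            props = doc_type.get("properties", {})
            if props.get(field, {}).get("type") == "string":
                hits.append(index)
    return hits

# Hypothetical mapping response for illustration:
example = {
    "logstash-2016": {"mappings": {"logs": {"properties": {"foo": {"type": "string"}}}}},
    "logstash-2017": {"mappings": {"logs": {"properties": {"foo": {"type": "integer"}}}}},
}
print(indices_with_string_field(example, "foo"))  # ['logstash-2016']
```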
When you know the affected indices, you can then create a new index with a proper mapping and reindex all documents.
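Creating the new index might look like this (index, type, and field names are placeholders; the exact mapping body depends on your Elasticsearch version):

```
PUT /logstash-2016-v2
{
  "mappings": {
    "logs": {
      "properties": {
        "foo": { "type": "integer" }
      }
    }
  }
}
```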
The reason for finding the docs was that I just wanted to remove the few initially created docs which had the field wrongly mapped as string. It also puzzles me that Logstash grok filters matching %{INT:field-name} end up as strings. I fixed this with a Logstash ruby filter section calling to_i on such fields.
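For reference, grok stores all captures as strings unless you ask for a conversion; the coercion can also be done in the filter config itself rather than in ruby (field names here are placeholders):

```
filter {
  grok {
    # the trailing :int asks grok to store the capture as an integer
    match => { "message" => "%{INT:field-name:int}" }
  }
  # alternatively, convert after the fact:
  # mutate { convert => { "field-name" => "integer" } }
}
```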
You have dynamic mapping enabled, yes. Under properties you can see which fields are defined in the mapping for the index collectd-2016-08-26.
All documents in an index will have the same mapping (with very few exceptions; see the user docs).
I cannot really say much about that but maybe somebody in the Logstash forum can clear up this confusion.
If you're on Elasticsearch 2.3 or above, you can use the reindex API. You should also consider using aliases, so you can change the underlying index name transparently to the application (see the Definitive Guide).
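A sketch of that flow (index names are placeholders; _reindex requires Elasticsearch 2.3 or later):

```
POST /_reindex
{
  "source": { "index": "logstash-2016" },
  "dest":   { "index": "logstash-2016-v2" }
}

DELETE /logstash-2016

POST /_aliases
{
  "actions": [
    { "add": { "index": "logstash-2016-v2", "alias": "logstash-2016" } }
  ]
}
```

With the alias in place, applications keep querying logstash-2016 and never notice the swap.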
I only get the few properties above when asking for _mapping (that is the full output), though I have many more fields in the index. How could I change the mapping for an existing index? Pointers appreciated.
I managed to create a new index with the correct mapping, reindex the data into it, remove the old index, and alias its name to the new index. Thanks!
It also turned out there was a bad date-plugin format specifier in my new Logstash filter, so @timestamp became wrong and docs were put into the wrong indices.