I understand how to create fields in Logstash, but once they go into Elasticsearch and show up in Kibana I get .raw versions of them, and my indexes were significantly smaller before the new ELK 2+ stack. Is there a way to save disk space through my Logstash config?
Go through your string fields and make sure they're analyzed in the best way. By default you get the field itself analyzed plus a .raw subfield containing the non-analyzed string, but this doesn't always make sense: some fields are better left unanalyzed from the start (then you don't need a separate .raw subfield), and for others the .raw subfield serves no purpose. Adjust the index template as needed.
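For example, a field you only ever filter or aggregate on exactly (never full-text search) can be mapped as not_analyzed from the start, with no .raw subfield at all. A minimal sketch of what that might look like in the index template, using a hypothetical `useragent` field and Elasticsearch 2.x string mapping syntax:

```json
{
  "mappings": {
    "_default_": {
      "properties": {
        "useragent": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

With a mapping like this there's only one copy of the string on disk, instead of an analyzed field plus a .raw copy.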
Copy the index template file that ships with Logstash (/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.0.5-java/lib/logstash/outputs/elasticsearch/elasticsearch-template.json or similar) to another location, edit it, and update the elasticsearch output's template option to point to your modified copy. The next time ES creates an index it should have your updated mappings. You can't change the existing mappings without reindexing.
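The elasticsearch output part might look something like this (the file path is just an example location for your edited copy; `template_overwrite` makes Logstash replace any template with the same name already stored in ES):

```conf
output {
  elasticsearch {
    hosts              => ["localhost:9200"]
    # Point to your modified copy of the shipped template
    template           => "/etc/logstash/my-template.json"
    template_overwrite => true
  }
}
```

Remember that this only affects indexes created after the change; existing indexes keep their old mappings until you reindex.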
This is awesome and exactly what I'm looking for! Thanks for the quick response. I found the file, but I'm not sure what I'm looking at. Can you point me to any documentation so I can teach myself what it all means?
I've read all of these now, and my question is: is there a place to set "doc_values": false for all fields, or do I have to wait for the fields to be created and then change them after the fact? If I have a bunch of fields, do I have to define each field in this template?
You can set default mappings per type, which is what the default template does to enable doc values and add the .raw subfield. Which version of the elasticsearch output are you using? There have been some recent changes there.
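Concretely, the mechanism is a dynamic template in the `_default_` mapping, which applies a mapping rule to every field of a given type as it's created, so you don't have to list each field individually. A rough sketch of one that maps all new string fields as not_analyzed with doc values disabled (the template name `strings_no_doc_values` is made up, and note the trade-off: with doc_values off you can't efficiently sort or aggregate on those fields):

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings_no_doc_values": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": false
            }
          }
        }
      ]
    }
  }
}
```

As before, this only applies to indexes created after the template is installed.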