I am running into an issue with a field when creating a visualization. In summary, I have a log field that holds a ratio. Example values below:
.678534
-995849
.234752
When I do a unique count on this field, I always get 3 values back, which is incorrect. It appears the real values are being rounded to the nearest whole number: 1, 0, or -1.
I had to change the field format in the index pattern to include the additional decimal places. Previously the format was rounding the field. That corrected how the field is displayed, but the change does not appear to carry over to the visualizations.
The field value is set to "long". What I was describing was the default number setting:
0,0.[000] - this was rounding everything beyond the thousandths place, i.e. .99999 was displayed as 1.
I adjusted this to 0,0.[000000] which corrected that issue. However, this is not translating to the visualizations. You can imagine how many unique values this would return with ratios going out to the .000000 place.
When attempting a unique count of this field, only 3 values come back (1,0,-1) and all real values (.685937, .234758, -678947) are being rounded and grouped under one of these whole numbers.
Here is the entry in the logstash template:
"pcr":{
"type":"long"
},
Sorry, I am getting 404 errors trying to retrieve this through Dev Tools.
Went straight to our indexer and got this back:
{"logstash-bro-2019.02.12":{"mappings":{"doc":{"pcr":{"full_name":"pcr","mapping":{"pcr":{"type":"long"}}}}}}}
Hi Brandon, I believe the problem is that Elasticsearch coerces your values to integers because the field is mapped as a long. According to the coerce docs, you should be able to disable coercion by setting "coerce": false in the mapping for your field.
Barring that, you could also try reindexing your data, by first creating a new index with a mapping of the pcr field to float or double, and then reindexing your data into this new index.
Thanks! A couple of things here I have questions about:
1. If other elements of Elasticsearch (Timelion, Discover) are returning the correct values, why would this apply to just visualizations?
2. I have added the "coerce": false statement to the logstash template for pcr and restarted Elasticsearch, and I am seeing no change. I was hoping it would show up when running curl 'localhost:9200/logstash-bro-2019.02.12/_mapping/field/pcr', however that coerce statement is not there.
Does that mean that statement is not being read on initialization or is that command not designed to return that value? I am trying to read mappings/_doc/properties/pcr via curl but I keep hitting errors.
Lastly, I would like to refrain from reindexing due to the time it takes.
Hi Brandon, I got a little more information from an Elasticsearch engineer who clarified things for me. I think I gave you slightly incorrect information before.
coerce only controls whether ES accepts the data or not. In your case the field is mapped as long. coerce is on by default, which means ES will accept values that aren't long, such as your decimals. When you set coerce to false, ES will reject values that are the wrong type. My apologies here, I misunderstood the docs and gave you the wrong information.
To clarify, Elasticsearch will still store all of the same original values that you send to it, even if they are a different data type than what is defined in the mapping. It's only during aggregations that the mapping data type affects how the numbers are interpreted, which explains why you can see all of the correct original float values in Discover, but when you try to create visualizations the numbers are not what you expect -- because Visualize uses Elasticsearch aggregations.
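To make the difference between Discover and Visualize concrete, here is a minimal Python sketch. It assumes the value an aggregation sees is the decimal truncated toward zero, which is how the coerce docs describe floats being handled for integer fields. The sample ratios are illustrative only, not taken from a real index, and as_long and unique_count are hypothetical helper names.

```python
import math


def as_long(value: float) -> int:
    """Truncate toward zero, the assumed behaviour when a float lands in a long field."""
    return math.trunc(value)


def unique_count(values) -> int:
    """Count distinct values, a stand-in for a unique count (cardinality) aggregation."""
    return len(set(values))


# Illustrative ratios in the range [-1, 1]; these are assumptions, not real log data.
ratios = [0.678534, -0.995849, 0.234752, 1.0, -1.0]

# Discover reads the original document, so every distinct ratio is visible.
distinct_in_source = unique_count(ratios)

# An aggregation on a long field only sees whole numbers.
distinct_in_aggregation = unique_count(as_long(r) for r in ratios)

print(distinct_in_source, distinct_in_aggregation)  # prints "5 3"
```

Under these assumptions the five distinct ratios collapse into 0, 1 and -1, which matches the three buckets described earlier in the thread.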
So, your solution involves setting the correct mapping on a fresh index, whether that means reindexing existing data or waiting for a new index to be created with new daily data. Make sure that the mapping is set to double or float on the new index and this should solve your problem.
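As a rough sketch of the reindex route, the Python below builds the two request bodies and sends them with the standard library. The host, source index, and "doc" mapping type come from the output posted earlier in the thread; the destination index name and all function names are hypothetical, and the bodies assume the create index and _reindex APIs of a 6.x cluster. Treat it as a starting point rather than a tested migration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"          # host and port from the curl call in the thread
SOURCE_INDEX = "logstash-bro-2019.02.12"  # index name from the thread
DEST_INDEX = SOURCE_INDEX + "-double"     # hypothetical name for the new index


def create_index_body(field="pcr", field_type="double", doc_type="doc"):
    """Mapping that declares the ratio field as a floating point type."""
    return {"mappings": {doc_type: {"properties": {field: {"type": field_type}}}}}


def reindex_body(source, dest):
    """Body for the _reindex API: copy every document from source to dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


def send(method, path, body):
    """Send a JSON request to Elasticsearch and return the parsed JSON response."""
    request = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def migrate():
    """Create the destination index with the new mapping, then copy the data over."""
    send("PUT", "/" + DEST_INDEX, create_index_body())
    return send("POST", "/_reindex", reindex_body(SOURCE_INDEX, DEST_INDEX))
```

Calling migrate() against a live cluster would run both steps. The mapping has to exist before any documents arrive, because the type of an existing field cannot be changed in place.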
Thanks for the update. So, I have updated my logstash-template.json file to reflect the new mapping. New indices are created daily, however today's index is still mapping pcr as "long".
Is there something I am missing when it comes to mapping changes? I thought changing this template, restarting services, and writing a new index would make the change.
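One thing worth checking here is the template that Elasticsearch itself has stored, since new indices take their mappings from the stored template and not from the JSON file on disk. A possible cause, not confirmed in this thread, is that the edited file was never uploaded: the Logstash elasticsearch output has a template_overwrite option that defaults to false. The sketch below fetches the stored template and reports the type it declares for pcr. The template name "logstash" and both function names are assumptions, and the response shape assumed is that of a 6.x cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # host and port from the curl call in the thread
TEMPLATE_NAME = "logstash"        # assumption: the real template may have another name


def fetch_template(name=TEMPLATE_NAME):
    """Return the stored index template(s) matching the given name."""
    with urllib.request.urlopen(f"{ES_URL}/_template/{name}") as response:
        return json.load(response)


def field_type_in_template(template_response, field="pcr", doc_type="doc"):
    """Map each template name to the type it declares for the field (None if absent)."""
    found = {}
    for name, template in template_response.items():
        properties = template.get("mappings", {}).get(doc_type, {}).get("properties", {})
        found[name] = properties.get(field, {}).get("type")
    return found
```

If field_type_in_template(fetch_template()) still reports "long", the stored template was not updated, and every new daily index will keep the old mapping until it is.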