Thanks for your explanation. Do scripted fields only work with JSON objects? For example, the value in my "log" key is "I, [2018-02-28T14:50:57.606764 #9] INFO -- : [71f1707b-f78b-4112-a7ae-4437b67b74b1] {"user_id":5, ....}". There is a timestamp before the JSON object in this key, and when I tried to parse it out using scripted fields I got a "non-array type" error.
Well, the document is in Elasticsearch, so it is a JSON object. Beyond that, "Painless" is a programming language and you can use it to manipulate the data any way you want. But you should probably open a thread about how to use script_fields, as I am not very familiar with the details. I have only written a couple of basic ones, and found that for my needs they were too slow; pre-parsing was easier and faster.
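To make the pre-parsing idea concrete, here is a minimal Python sketch of splitting a line like the one above before it is indexed. It is only an illustration under some assumptions: the function name `split_log_line` is made up, the JSON payload is assumed to start at the first `{` in the line, and the `"action"` field in the sample is hypothetical because the original payload was truncated.

```python
import json


def split_log_line(line):
    """Split a log line into its text prefix and the trailing JSON payload.

    Assumes the JSON object starts at the first "{" and runs to the end
    of the line. Returns (line, None) when no valid JSON is found.
    """
    start = line.find("{")
    if start == -1:
        return line, None
    try:
        payload = json.loads(line[start:])
    except json.JSONDecodeError:
        # The tail looked like JSON but did not parse; keep the raw line.
        return line, None
    return line[:start].rstrip(), payload


# Sample line modelled on the one in the question; "action" is a made-up field.
sample = (
    "I, [2018-02-28T14:50:57.606764 #9] INFO -- : "
    "[71f1707b-f78b-4112-a7ae-4437b67b74b1] "
    '{"user_id": 5, "action": "login"}'
)

prefix, payload = split_log_line(sample)
print(prefix)
print(payload["user_id"])
```

The parsed fields can then be indexed as ordinary document fields, so nothing has to be parsed at query time.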
We should also mention that unique counts in Elasticsearch (for indices spread across multiple nodes/shards) are only an estimate. More information here
Logically this makes sense, because the only way to determine whether a value is truly unique is to have all of the data in one place for analysis. It is an example of the compromise between speed and accuracy.
This is one thing people moving from relational databases often struggle with (and when users with no understanding of how Elasticsearch works under the hood see "unique count" in Kibana, they assume it has a relational-DB level of accuracy).
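A toy Python sketch of why this is hard: each shard only knows its own distinct values, so adding up the per-shard counts overcounts anything that appears on more than one shard. The shard contents and function names below are invented for illustration, and this is not how Elasticsearch computes its estimate; it only shows why an exact answer needs all of the values in one place.

```python
# Two hypothetical shards holding user names; "bob" and "carol" appear on both.
shard_a = {"alice", "bob", "carol"}
shard_b = {"bob", "carol", "dave"}


def sum_of_shard_counts(shards):
    """Add up each shard's local distinct count (cheap, but overcounts)."""
    return sum(len(shard) for shard in shards)


def exact_unique_count(shards):
    """Bring every value together, then count distinct values (exact, but costly)."""
    return len(set().union(*shards))


shards = [shard_a, shard_b]
naive = sum_of_shard_counts(shards)
exact = exact_unique_count(shards)
print(naive, exact)  # prints "6 4"
```

The exact version has to ship every distinct value to one place, which is the cost that an approximate count avoids.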