I am trying to set up metric visualizations that show a count of unique objects shared between JSON documents in the same index. I have been able to achieve this by using metric formulas. My remaining problem is that the data I am trying to look at is contained within an array of objects in the JSON with many similar entries (all the data I am accessing is under entity.detail.valueString). These entity.detail.valueString values can be UUIDs, names, numbers, etc. In other visualizations I have been able to restrict the values to a certain type using regex; for example, I could look only at UUIDs. In metrics, for some reason, this isn't available. Am I overlooking something (being new to Kibana), or has anyone found a workaround for this?
Hi Marcus,
Could you post a screenshot of what you are trying to achieve? I think using a filter might be enough for you, but I don't understand how you are going about it (where you are placing your regex, and whether you use Lens or something else).
I'll start off by saying I am using Lens for my visualizations. Within Lens, some visualizations have a place in the advanced settings to include or exclude certain values. This is a field labeled "Include values", where I can flip a switch that says "use regular expression" and then filter with regex. For instance, I could enter "Dr.*" in the field to include only terms beginning with "Dr".
What I am working with is lots of JSON with entity, an array of objects that looks like this when viewed through the Elastic index page.
The formula I'm using for my metric is this:
unique_count(entity.detail.valueString.keyword, kql='id.keyword : "CDS-Call" ') - unique_count(entity.detail.valueString.keyword) + unique_count(entity.detail.valueString.keyword, kql='id.keyword : "Application-Start" ')
It returns the count of unique items in the entity.detail.valueString.keyword field that appear both in JSONs with id.keyword : "CDS-Call" and in JSONs with id.keyword : "Application-Start". As I already stated, this formula is working. Note that the entire visualization is already filtered so it only looks at JSONs with id.keyword : "CDS-Call" or "Application-Start", which is why the middle unique_count function does not have any kql.
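The formula gives an overlap because of inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|, where A and B are the unique values seen under each id. Below is a minimal Python sketch of that arithmetic. The sample lists and the function name overlap_count are made up for illustration and are not part of Kibana. Also note that, as far as I know, unique_count is backed by Elasticsearch's cardinality aggregation, which is approximate, so on large data the result in Lens may be slightly off, whereas the sketch below is exact.

```python
def overlap_count(values_a, values_b):
    """Count values present in both groups via |A| + |B| - |A union B|."""
    unique_a = set(values_a)          # unique_count(..., kql for group A)
    unique_b = set(values_b)          # unique_count(..., kql for group B)
    unique_all = unique_a | unique_b  # unique_count(...) over both groups
    return len(unique_a) - len(unique_all) + len(unique_b)


# Hypothetical values of entity.detail.valueString.keyword per group.
cds_call = ["id-1", "id-2", "id-3", "id-2"]
application_start = ["id-2", "id-3", "id-4"]

overlap = overlap_count(cds_call, application_start)
print(overlap)  # 2, because "id-2" and "id-3" occur in both groups
```

The identity only holds if the middle term covers exactly the union of the two groups, which is why the visualization-level filter on the two ids matters.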
In my situation, I know filtering does not work. I have already tested it with multiple different filters, using both a regexp query and a term query. I believe it doesn't work because filters only select JSONs based on whether a matching value exists in them; they don't remove unwanted terms from entity.detail.valueString.keyword the way the "Include values" field mentioned earlier does. My entity arrays will always have a correlationId, which is what I am trying to track, but they will also always have other values in other objects' valueString, such as numbers, names, etc.
This visualization is a metric, which is why I can use the formula, and the formula is essential for my use case. However, it doesn't have an option to include or exclude values like other visualizations do. It only has a "Filter by" field, which I have tried to use, but I believe it has the same problem as the other filters: it only filters on whether a value exists or not.
Let me know if you need other information. I think this mostly explains the problem I am having, but I could easily have left something out by accident or without knowing to put it in. I would like to include more screenshots for you, but my trust level is too low at the moment as I am a new user.
So the problem is that all the array values are counted instead of just the ones you filter by? I will try to reproduce it and see if this is a bug.
In the meantime, regarding "Include values": this setting exists only for top values. That's why you don't see it in the options for your metric, but it could still help you. If you use a metric visualization, create a regular unique count(entity.detail.valueString.keyword) metric, and then in "Break down by" use top values and choose the values you want to include. You could later use the 'collapse by' setting to sum the values, if that's your goal.
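To make the suggestion concrete, here is a rough Python sketch of what I understand the top values plus "collapse by" sum combination to compute. It is only an illustration under assumptions: the function name, the sample values, and the size default are hypothetical, and Python's re.fullmatch stands in for the regex matching, which as far as I know is anchored to the whole term in Elasticsearch.

```python
import re
from collections import Counter


def top_values_unique_count(values, include_regex, size=5):
    """Bucket values by term, keep terms matching the regex, sum per-bucket unique counts."""
    pattern = re.compile(include_regex)
    # "Include values" with a regex: only matching terms get a bucket.
    term_counts = Counter(v for v in values if pattern.fullmatch(v))
    # Top values keeps only the `size` most frequent terms.
    top_terms = [term for term, _ in term_counts.most_common(size)]
    # Each bucket holds a single distinct term, so its unique count is 1.
    per_bucket = {term: 1 for term in top_terms}
    # "Collapse by" sum adds the bucket metrics together.
    return per_bucket, sum(per_bucket.values())


# Hypothetical values of entity.detail.valueString.keyword.
values = ["Dr. Smith", "Dr. Jones", "Dr. Smith", "42", "Mr. Brown"]
buckets, total = top_values_unique_count(values, r"Dr.*")
print(total)  # 2: only "Dr. Smith" and "Dr. Jones" pass the include regex
```

One thing to watch in this sketch: the result is capped by the number of top values requested, so the size would need to be at least as large as the number of distinct values you expect to match.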
As for filters, I am honestly not sure whether it's a bug or how filters are intended to work. I have done a bit of reading on how Elasticsearch handles arrays, and it would make sense that I can't filter to look only at certain objects within the array, given how Elasticsearch flattens arrays of objects. It makes sense that a filter would only act on whether or not the filtered value exists somewhere in a given JSON.
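Here is a small Python sketch of that flattening behaviour, for anyone who finds this later. It is a simulation, not Elasticsearch code: the document shape (entity as a list of objects, each with detail.valueString), the sample values, and all function names are assumptions made for illustration. It contrasts a document-level filter with a value-level filter like "Include values".

```python
import re

# Simplified lowercase UUID pattern, used only for this illustration.
UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def flatten(doc):
    """Mimic a non-nested array of objects: one multi-valued field per leaf."""
    return [obj["detail"]["valueString"] for obj in doc["entity"]]


def unique_count_document_filter(docs, pattern):
    """Keep whole documents that contain a match, then count ALL their values."""
    kept = [flatten(d) for d in docs if any(pattern.fullmatch(v) for v in flatten(d))]
    return len({v for values in kept for v in values})


def unique_count_value_filter(docs, pattern):
    """Count only the individual values that match."""
    return len({v for d in docs for v in flatten(d) if pattern.fullmatch(v)})


uuid = "123e4567-e89b-12d3-a456-426614174000"
docs = [
    {"entity": [{"detail": {"valueString": uuid}},
                {"detail": {"valueString": "Dr. Smith"}},
                {"detail": {"valueString": "42"}}]},
    {"entity": [{"detail": {"valueString": uuid}},
                {"detail": {"valueString": "Dr. Jones"}}]},
    {"entity": [{"detail": {"valueString": "7"}}]},
]

doc_level = unique_count_document_filter(docs, UUID_RE)
value_level = unique_count_value_filter(docs, UUID_RE)
print(doc_level, value_level)  # 4 1
```

The document-level filter drops the third document but still counts the names and numbers in the first two, which matches what I was seeing with "Filter by".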
The suggestion to use Top Values seems to be working at the moment for me.