I'm trying to create a rule in ES using the metric threshold. If I use a single filter, it works fine. But when I use a group of filters strung together with "and", like filter1 and filter2 and filter3, the rule doesn't work. I've also tried without "and", but then I'm not able to save the rule.
Hi @vsabado Welcome to the community.
What version are you on?
Have you tried constructing the filter in Discover to make sure it is filtering the correct documents?
Also, it would be helpful if you shared the entire alert and filter; otherwise we would just be guessing.
That's fair!
This is the query I'm trying to get to work: labels.http_route: "/pos/order/{orderId}/{version}/complete" and http.response.status_code: 200 and url.path: *
This is the one that works: labels.http_route: "/pos/order/{orderId}/{version}/complete"
I've tried it in Discover and it works there. It's odd, because both Discover and Rules accept KQL.
Version:
8.5.2
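For readers following along, here is a minimal sketch of what the three-clause KQL filter above roughly corresponds to in Elasticsearch Query DSL. The mapping used here (quoted value to match_phrase, bare value to match, "field: *" to exists) is an assumption about how KQL is typically translated; the query Kibana actually generates may differ in detail. The function name kql_and_filter is hypothetical.

```python
# Rough Query DSL equivalent of:
#   labels.http_route: "<route>" and http.response.status_code: 200 and url.path: *
# Assumption: this mirrors the usual KQL translation; Kibana's real output may differ.
def kql_and_filter(route: str, status: int) -> dict:
    return {
        "bool": {
            # Every clause in "filter" must match, which is what "and" means in KQL.
            "filter": [
                {"match_phrase": {"labels.http_route": route}},
                {"match": {"http.response.status_code": status}},
                {"exists": {"field": "url.path"}},
            ]
        }
    }


query = kql_and_filter("/pos/order/{orderId}/{version}/complete", 200)
```

The point of the sketch is that each added "and" clause is one more condition every document has to satisfy, so a document missing any one of the three fields drops out.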
So the complex first one works in Discover?
And exactly the same query in the metric threshold rule does not work?
And when you say it does not work, what exactly does that mean? Does it show an error? Does it not accept the query while you type it in? What behavior are you seeing?
What is the data type of the response code?
Yes, that is correct. Either of those queries will work in Discover just fine. I can even create an elasticsearch rule off of it. However when I do a metric threshold using the same KQL, then I hit problems.
It works in the sense that I actually get alerts sent to me via email with this query: labels.http_route: "/pos/order/{orderId}/{version}/complete". Using the more complex one, I don't get a single email alert at all, even though I know the conditions are met.
When I remove all of the "and" operators and attempt to save, there is no response; it just doesn't save. Keeping "and" in there does allow me to save, but no alerts fire. The chart also says "no data".
I'm not sure what you mean by your last question.
When you test the rule does it show results?
If your filter is working and you say you should have results, they should show in the graph.
If you do not see a chart or results, then you will get no alerts.
If this is a percent, are you putting in the right threshold?
Yes, there appears to be a correlation between that graph and if the alert fires off. With the complex query, it just shows no data. But I know that's false because discover shows me data.
Another thing to note though is that I'm using a group by. I actually want to modify my statement earlier. My query on discover is actually this: labels.retailer : "retailer" and labels.storeName : "store" and labels.http_route: "/pos/order/{orderId}/{version}/complete" and http.response.status_code: 200 and url.path: *
A few differences but somewhat close. I could have any combination of store and retailer, hence why I'm opting to use group by here. It works perfectly fine up until I start adding more filters
Did you modify the settings in the Infrastructure UI to match the index pattern for the DataView you’re using in Discover?
I'm sorry but I don't know what this means. Can you direct me to where I might find that setting?
Under “Observability” in the left hand nav, click on “Inventory” then click on “Settings” in the upper right hand corner of the screen. Change the “Metric indices” to match the index pattern of the DataView.
For Discover, I use APM. So I would just add that to this list, like so? metrics-,metricbeat-, APM
But it's still a little strange that data was showing up with just one filter. The issue only happens when I add more, and "and" them
After adding APM, the issue is still happening. Is there a problem with my KQL?
BTW What you are showing is not a metric threshold rule
That is a log threshold rule I believe (can you confirm)
If you are using a log threshold rule, you need to add the index pattern under Settings in the Logs UI.
What exactly do you want to alert on?
Can you just provide the pseudocode / logic, and which index you want to work on?
There are some non-obvious settings/approaches, since it sounds like you want to build a custom rule.
Rules work on certain sets of indices / data views.
So tell us what you WANT to do in terms of logic and index, and perhaps we can help.
It says metric threshold here so just going off of that.
We're gathering a variety of metric data from our application, including the number of transactions completed. Creating a single rule using Elasticsearch query and the KQL I mentioned in this post does work, but since we can have any combination of retailer and store, we'd hit a scaling issue. We couldn't possibly create a rule for each possible combination across thousands of stores. So I'm leaning towards metric threshold with KQL and group by. It does fit my use case, and it does the job relatively well for any combination of stores and retailers that I have, at least up until I start to use more than one filter in the KQL query.
Basically, if ANY of my store and retailer combinations hits a document count of 100 (successful transactions, which I identify through the http endpoint tag), then send an alert telling me what the document count is as well as which retailer and store combo it is.
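The logic described above can be sketched in a few lines of Python. This is only an illustration of the intended grouping, not how the Metric Threshold rule is implemented; the function name groups_over_threshold and the sample documents are hypothetical, and "hits 100" is read here as "at or above 100".

```python
from collections import Counter

THRESHOLD = 100  # document count taken from the post above


def groups_over_threshold(docs, threshold=THRESHOLD):
    """Count documents per (retailer, store) pair and keep pairs at or above threshold."""
    counts = Counter(
        (d["labels"]["retailer"], d["labels"]["storeName"]) for d in docs
    )
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Hypothetical documents that already passed the KQL filter.
docs = (
    [{"labels": {"retailer": "r1", "storeName": "s1"}}] * 120
    + [{"labels": {"retailer": "r1", "storeName": "s2"}}] * 40
)
alerts = groups_over_threshold(docs)
```

Each key in the result is one retailer and store combination, and each value is the document count that would go into the alert message.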
"Rules work on certain sets of indices / data views."
This is unfortunately where you lose me. In Metric Threshold, there's no option to select a data view. In Elasticsearch query, I see that option and I do have it set to "APM".
I would love to just use Elasticsearch query, if it had a group by option but it doesn't.
I also want to take this moment to thank you guys for the assist. This problem is a bit of a blocker for my team right now
Hi @vsabado
1st, cool: I had not seen Document Count in the metric threshold rule before (so I learned something, thanks), but I would probably not use that rule type.
I have some other work now... but I will get back to this.
I want to confirm which data view/indices you want to work on. It sounds like APM transactions, so apm-*. Can you confirm?
And on the use case
So for the actual rule to fire
First, apply the filter
then for the rule to fire
A) A combination of store + retailer is above 100 so you will be alerted on a Store + Retailer
Or
B) Any Store above 100 or Any Retailer above 100 (stores and / or retailers) as they happen
The more precise you are the better I can help ...
Yes that is correct. We can work off of APM. I know that works because that's what's on my Discover as well. If there's a better rule type we can use aside from Metric threshold, then I'm happy to hear more about that.
Option A is what I'd like. The rule must act on any combination of store AND retailer. This is meant to be used for troubleshooting purposes, so at the end of the day I need to know what store at what retailer to look at if we hit a certain threshold for any particular metrics we're interested in.
All of the data flowing up to Elasticsearch will have these two properties, storeName and retailer. Hence why we're using them here to filter through our dataset
Okay I'll digest this a bit.
It's always better when we get down to what you actually want to accomplish.
But I can tell you that, because you actually want the combination of retailer and store, you may need a different approach to be alerted on it...
That is because that combined field does not actually exist. In Elasticsearch terms, that's a two-level aggregation.
Or we might need to create a runtime field that combines them
So let me think about that...
Perhaps @simianhacker may have a suggestion whilst I am busy with my other work.
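As an illustration of the runtime-field idea mentioned above, here is a minimal sketch of a search request body, written as a Python dict, that defines a combined field at query time and groups on it. The names retailer_store and by_retailer_store are hypothetical. It assumes labels.retailer and labels.storeName are keyword fields present on every document; the script would fail on a document where either is missing. This is a body for a plain search request, not something the Metric Threshold rule form accepts.

```python
# Sketch of a search body using runtime_mappings to build a combined
# "retailer|store" key, then a terms aggregation to count documents per key.
# Assumption: both label fields are keyword fields and always populated.
search_body = {
    "size": 0,  # only the aggregation buckets are needed, not the hits
    "runtime_mappings": {
        "retailer_store": {  # hypothetical field name
            "type": "keyword",
            "script": {
                "source": (
                    "emit(doc['labels.retailer'].value + '|' "
                    "+ doc['labels.storeName'].value)"
                )
            },
        }
    },
    "aggs": {
        "by_retailer_store": {  # hypothetical aggregation name
            "terms": {"field": "retailer_store", "size": 1000}
        }
    },
}
```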
Sorry I keep mentioning this, but I just want to drive the point home that the metric threshold does exactly what I want it to do with one filter. I'm just not sure why the rule doesn't work when I start using more than one filter. Do we have an explanation for that?
We have some data transformation happening before our data is sent off to Elasticsearch, so all the data we're acting on will have those two fields present, if that is what you meant.
Hi @vsabado
I appreciate the hammering, it's all good.
But I also know from solving many many of these topics that often when something "obvious" does not work, how to fix it is often non-obvious and so it requires some detailed questions and answers, some of which may or may not be linear.
So here are a couple of things I am considering.
From above
1st, if you actually have a space there, it is probably not picking up that data view. There was a bug around that a while back: in older releases (which you may be on), I think it just ignored the index pattern that came after the space. I think that has since been fixed so that spaces are not allowed.
Newer version
2nd, that entry should be an index pattern, not a Data View.
3rd, you probably need to create a new index pattern specifically for what you want.
And 4th, the reason the multiple filters are not working is probably that the alert query is not actually running against the APM data you think it is (I am almost positive of that). It is running against one of the other index patterns, not the one you added, and the fields used in the filter do not all exist in those other patterns.
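The 4th point can be illustrated with a small sketch. It assumes, hypothetically, that documents in the metrics indices carry labels.http_route but not the HTTP fields, which would explain why the single filter returned data while the combined one did not. The documents and the matches helper are made up for illustration.

```python
def matches(doc: dict, required: dict, must_exist: list) -> bool:
    """AND semantics: every required value must match and every listed field must exist."""
    return all(doc.get(k) == v for k, v in required.items()) and all(
        f in doc for f in must_exist
    )


ROUTE = "/pos/order/{orderId}/{version}/complete"

# Hypothetical flattened documents; the metrics doc lacks the HTTP fields.
metrics_doc = {"labels.http_route": ROUTE}
trace_doc = {
    "labels.http_route": ROUTE,
    "http.response.status_code": 200,
    "url.path": "/pos/order/1/2/complete",
}

single = {"labels.http_route": ROUTE}
combined = {"labels.http_route": ROUTE, "http.response.status_code": 200}

single_on_metrics = matches(metrics_doc, single, [])
combined_on_metrics = matches(metrics_doc, combined, ["url.path"])
combined_on_traces = matches(trace_doc, combined, ["url.path"])
```

Under that assumption, the single filter matches the metrics document, the combined filter matches nothing there, and it only matches once the query runs against documents that carry all three fields.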
So let's try a few things and see if we can get you unblocked.
1st, go to Stack Management -> Data Views and create a new Data View like this.
This assumes you did not change the names of the APM Indices.
2nd, go to Infrastructure (or Inventory) under Observability -> Settings and set the following.
Only set this
Check Discover and make sure it has the data you think it does.
Mine is different, but there are 2 filters... make sure your KQL works here.
Then try your Metric Threshold alert.
Then, if that works, you can go back and add the metric index patterns to the Inventory / Infrastructure setting, but with no spaces.
@stephenb Your response is exactly what I wanted to write but I was traveling... the filter needed the traces index in order to work properly.
@vsabado Side note: We are currently developing a new rule called the Custom Threshold rule in Observability that has all the same functionality of the Metric Threshold rule with some enhancements. One of the enhancements is the ability to pick a DataView so you don't have to go through the Infrastructure UI to change the data source. It's available in 8.10 as technical preview: