Hi,
We have an alerting mechanism in place using Watcher, with a complex filter query and an attachment from Kibana saved objects. Our index contains multiple records with very similar field values and no unique value among them, because the same record gets indexed from multiple sources. Our requirement is to pick the latest records and share them with the end users as a spreadsheet. In our current implementation using the saved-objects attachment, duplicates are not removed; however, when we apply the same search query with an aggregation, the results come back as expected. Could you please help me with any other solution that excludes all similar/duplicate records and displays just one record out of all the similar ones?
Note: we don't have Logstash in place; the data is coming from a Solace queue.
Any help would be greatly appreciated.
Thanks!
Welcome to our community!
We'd need to see a lot more detail to help here; examples of your queries, your data, and the watches would be very helpful.
Thank you for responding.
Data is coming from a Solace queue, IDs are automatically generated by Elasticsearch, and most of the fields are mapped as wildcard or date, with a few fields as text/keyword.
PS: the query part of the watcher is working as expected (removing the similar records and displaying only the latest record), but the results are fetched from the attachment, which comes from saved objects (we are not able to apply aggregations in the Discover tab).
PUT _xpack/watcher/watch/log_error_watch
{
  "trigger": {
    "schedule": {
      "interval": "10m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": ["logs"],
        "body": {
          "query": {
            "match": {
              "message": "error"
              // multiple match queries
            }
          },
          "aggs": {
            "dedup": {
              "term": {
                "field": "example"
              },
              "aggs": {
                "latestdoc": {
                  "hits": {
                    "size": 1,
                    "sort": ["@timestamp"]
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "We got the expected error"
        // plus an attachment csv url here from the saved objects tab
      }
    }
  }
}
There is no term aggregation type, so I assume you mean terms. There is no hits aggregation type, so I assume you mean top_hits.
If you want to find the latest doc using top_hits, it would need to sort in descending time order:
"aggs": {
"latestdoc": {
"top_hits": {
"size": 1,
"sort": [
{
"timestamp": {
"order": "desc"
}
}
]
}
}
}
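Putting both corrections together and applying them to your watch, the whole dedup block would look something like this (assuming example is a keyword field and @timestamp is your date field, as in your original sort):

"aggs": {
  "dedup": {
    "terms": {
      "field": "example"
    },
    "aggs": {
      "latestdoc": {
        "top_hits": {
          "size": 1,
          "sort": [
            {
              "@timestamp": {
                "order": "desc"
              }
            }
          ]
        }
      }
    }
  }
}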
So, clearly, you already have a few mistakes in this that will prevent it from working as you expect.
Now, once you get the aggregations correct, you can use the payload of the resulting aggregations to report out just the information you want. Below is an example using the built-in Kibana Sample Web Logs demo data set:
POST _watcher/watch/_execute
{
"watch": {
"trigger": {
"schedule": {
"interval": "10m"
}
},
"input": {
"search": {
"request": {
"indices": [
"kibana_sample_data_logs"
],
"body": {
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"response.keyword": "503"
}
}
]
}
},
"aggs": {
"dedup": {
"terms": {
"field": "geo.dest",
"size" :1000
},
"aggs": {
"latestdoc": {
"top_hits": {
"size": 1,
"sort": [
{
"timestamp": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"log": {
"transform": {
"script": "return ctx.payload.aggregations.dedup.buckets.stream().map(p -> ['geo.dest':p.key,'time':p.latestdoc.hits.hits.0._source.timestamp]).collect(Collectors.toList());"
},
"logging": {
"text": """
We got the expected error for country codes
{{#ctx.payload._value}}
{{geo.dest}} at {{time}}
{{/ctx.payload._value}}
"""
}
}
}
}
}
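If the transform script is hard to read as a one-liner, here it is again, spread out with comments. It is the same logic; note that because the script returns a list rather than a map, Watcher exposes the result to the template as ctx.payload._value:

// One list entry per terms bucket, i.e. per unique geo.dest value.
return ctx.payload.aggregations.dedup.buckets.stream()
    // Each bucket carries its key plus the single newest document
    // returned by the top_hits sub-aggregation.
    .map(p -> [
        'geo.dest': p.key,
        'time': p.latestdoc.hits.hits.0._source.timestamp
    ])
    // The collected list is what the {{#ctx.payload._value}} block
    // in the logging text iterates over.
    .collect(Collectors.toList());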
The output looks like:
We got the expected error for country codes
CN at 2023-02-22T08:20:29.957Z
IN at 2023-02-22T15:39:31.289Z
US at 2023-02-23T10:08:39.040Z
ID at 2023-02-21T08:06:12.151Z
BR at 2023-02-22T13:06:45.025Z
... and so on
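Note that _execute runs the watch inline without storing it, which is handy while iterating. Once you're happy with the output, store the same definition (everything inside the "watch" object above) with PUT _watcher/watch/<your_watch_id> so it runs on the 10-minute schedule.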