How to list one record when there are 4 similar records with no unique field in ELK

Hi,
We have an alerting mechanism in place using Watcher, with a complex filter query and an attachment from Kibana saved objects. Our index contains multiple records with very similar field values and no unique field, because the same record gets indexed from multiple sources. Our requirement is to pick the latest record and share it with the end users in the form of a spreadsheet. With our current implementation using the saved-objects attachment, duplicates are not getting removed; however, when we apply the same search query with an aggregation, the results come out as expected. Could you please help me with any other solution where I can exclude all the similar/duplicate records and display just one single record out of each group of similar ones?
Note: we don't have Logstash in place; the data is coming from a Solace queue.
Any help would be greatly appreciated.
Thanks!

Welcome to our community! :smiley:

We'd need to see a lot more detail to help here; examples of your queries, your data, and your watches would be very helpful.

Thank you for responding

Data is coming from a Solace queue, IDs are automatically generated by Elasticsearch, and most of the fields are mapped as wildcard or date, with a few fields as text/keyword.

PS: the query part of the watcher is working as expected (removing the similar records and displaying the latest record), but the results are fetched from the attachment that comes from saved objects (we are not able to apply aggregations in the Discover tab).
PUT _xpack/watcher/watch/log_error_watch
{
  "trigger": {
    "schedule": {
      "interval": "10m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": ["logs"],
        "body": {
          "query": {
            "match": {
              "message": "error"
              //multiple match queries
            }
          },
          "aggs": {
            "dedup": {
              "term": {
                "field": "example"
              },
              "aggs": {
                "latestdoc": {
                  "hits": {
                    "size": 1,
                    "sort": ["@timestamp"]
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "log": {
      "logging": {
        "text": "We got the expected error",
        "attachment": "" //we have an attachment csv url here from saved objects tab
      }
    }
  }
}

There is no term aggregation type so I assume you mean terms.
There is no hits aggregation type so I assume you mean top_hits.

If you want to find the latest doc using top_hits it would need to sort in descending time order:

                  "aggs": {
                    "latestdoc": {
                      "top_hits": {
                        "size": 1,
                        "sort": [
                          {
                            "timestamp": {
                              "order": "desc"
                            }
                          }
                          ]
                      }
                    }
                  }

So, clearly, you already have a few mistakes in this that will prevent it from working as you expect.
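Conceptually, the `terms` + `top_hits` (size 1, descending sort) pair performs a "group by key, keep only the newest document per group" operation. Here is a minimal Python sketch of that logic, purely illustrative — the field names `example` and `@timestamp` and the sample documents are hypothetical, not from your index:

```python
from datetime import datetime

def dedup_latest(docs, key_field, ts_field):
    """Group docs by key_field and keep only the newest doc per group,
    mirroring what a terms aggregation with a top_hits(size=1,
    sort: timestamp desc) sub-aggregation returns."""
    latest = {}
    for doc in docs:
        key = doc[key_field]
        ts = datetime.fromisoformat(doc[ts_field])
        if key not in latest or ts > datetime.fromisoformat(latest[key][ts_field]):
            latest[key] = doc
    return list(latest.values())

# Hypothetical documents with duplicate keys:
docs = [
    {"example": "A", "@timestamp": "2023-02-21T10:00:00", "message": "error 1"},
    {"example": "A", "@timestamp": "2023-02-22T10:00:00", "message": "error 2"},
    {"example": "B", "@timestamp": "2023-02-20T09:00:00", "message": "error 3"},
]

print(dedup_latest(docs, "example", "@timestamp"))
```

The key point is the descending comparison: only a newer timestamp replaces the stored document, which is why the `top_hits` sort must be `"order": "desc"` to get the latest record rather than the oldest.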

Now, once you get the aggregations correct, you can use the payload of the resulting aggregations to report out just the information you want. Below is an example using the built-in Kibana Sample Web Logs demo data set:

POST _watcher/watch/_execute
{
  "watch": {
    "trigger": {
      "schedule": {
        "interval": "10m"
      }
    },
    "input": {
      "search": {
        "request": {
          "indices": [
            "kibana_sample_data_logs"
          ],
          "body": {
            "size": 0,
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "response.keyword": "503"
                    }
                  }
                ]
              }
            },
            "aggs": {
              "dedup": {
                "terms": {
                  "field": "geo.dest",
                  "size": 1000
                },
                "aggs": {
                  "latestdoc": {
                    "top_hits": {
                      "size": 1,
                      "sort": [
                        {
                          "timestamp": {
                            "order": "desc"
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "condition": {
      "compare": {
        "ctx.payload.hits.total": {
          "gt": 0
        }
      }
    },
    "actions": {
      "log": {
        "transform": {
          "script": "return ctx.payload.aggregations.dedup.buckets.stream().map(p -> ['geo.dest':p.key,'time':p.latestdoc.hits.hits.0._source.timestamp]).collect(Collectors.toList());"
        },
        "logging": {
          "text": """
          We got the expected error for country codes
          {{#ctx.payload._value}}
          {{geo.dest}} at {{time}}
          {{/ctx.payload._value}}
          """
        }
      }
    }
  }
}

The output looks like:

          We got the expected error for country codes
          CN at 2023-02-22T08:20:29.957Z
          IN at 2023-02-22T15:39:31.289Z
          US at 2023-02-23T10:08:39.040Z
          ID at 2023-02-21T08:06:12.151Z
          BR at 2023-02-22T13:06:45.025Z
          ... 

... and so on
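Since the original goal was a spreadsheet rather than a log line, the same flattened bucket data can be written out as CSV instead of interpolated into logging text. A hedged sketch in Python, assuming the watch's aggregation response has already been parsed into dicts of the shape shown above (the sample bucket values here are illustrative, not real query results):

```python
import csv
import io

def buckets_to_csv(buckets):
    """Flatten terms/top_hits buckets into CSV rows (key, timestamp),
    extracting the same fields as the Painless transform in the watch."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["geo.dest", "time"])
    for b in buckets:
        # top_hits with size 1 yields exactly one hit per bucket
        hit = b["latestdoc"]["hits"]["hits"][0]["_source"]
        writer.writerow([b["key"], hit["timestamp"]])
    return out.getvalue()

# Shape mimics the watch's aggregation response (values are made up):
buckets = [
    {"key": "CN", "latestdoc": {"hits": {"hits": [
        {"_source": {"timestamp": "2023-02-22T08:20:29.957Z"}}]}}},
    {"key": "IN", "latestdoc": {"hits": {"hits": [
        {"_source": {"timestamp": "2023-02-22T15:39:31.289Z"}}]}}},
]

print(buckets_to_csv(buckets))
```

This way the deduplicated, latest-per-key rows come straight from the aggregation payload, with no dependency on the saved-objects attachment (which runs the plain search and therefore still contains the duplicates).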