How to keep only the X most recent documents per field value

We are evaluating Elasticsearch for storing our events and are very impressed so far.
However, there are two issues we need help with:

  • We need to keep only the X most recent events per MAC address; older events should be purged. Since the number of MAC addresses is in the thousands, a separate index per MAC is out of the question.
    I assume this should be possible using aggregations and pipelines, but I am not sure how.
  • Is there a way to get a metric for the number of queries served during a time period? We need this for our own benchmarking.

Thanks in advance.

Currently I am using the following approach:

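  • Iterate through all the MACs

  • For each MAC, fetch the first event past the 1000 most recent, sorted by eTime descending (mac is a placeholder for the current MAC address)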

      GET events-index/_search
      {
        "from": 1000,
        "size": 1,
        "query": {
          "bool": {
            "must": [{"match_phrase": {"mac": mac}}]
          }
        },
        "sort": {
          "eTime": {
            "order": "desc"
          }
        }
      }
    
  • Once I have the eTime of that document (ts below), delete it and everything older for that MAC

      POST events-index/_delete_by_query?conflicts=proceed
      {
        "query": {
          "bool" : {
            "must": [
              {"match_phrase": {"mac": mac}},
              {"range": {"eTime": {"lte": ts}}}
            ]
          }
        }
      }
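
For reference, the per-MAC steps above can be sketched in Python roughly as follows. This is only an illustration of my current approach, not a recommended solution: `search` and `delete_by_query` are hypothetical callables standing in for whatever HTTP client is used, and I am assuming the search response has the usual `hits.hits[]._source` shape. Only the request bodies come from the snippets above.

```python
from typing import Any, Callable, Dict, Optional

INDEX = "events-index"  # index name used in the requests above


def cutoff_query(mac: str, keep: int) -> Dict[str, Any]:
    """Search body returning the first event past the `keep` most recent for one MAC."""
    return {
        "from": keep,
        "size": 1,
        "query": {"bool": {"must": [{"match_phrase": {"mac": mac}}]}},
        "sort": {"eTime": {"order": "desc"}},
    }


def purge_query(mac: str, ts: Any) -> Dict[str, Any]:
    """Delete-by-query body matching events for `mac` with eTime <= ts."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"mac": mac}},
                    {"range": {"eTime": {"lte": ts}}},
                ]
            }
        }
    }


def purge_mac(
    mac: str,
    keep: int,
    search: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    delete_by_query: Callable[[str, Dict[str, Any]], Any],
) -> Optional[Any]:
    """Run both steps for one MAC; `search` and `delete_by_query` are hypothetical wrappers.

    Returns the cutoff eTime that was used, or None if there was nothing to purge.
    """
    hits = search(INDEX, cutoff_query(mac, keep))["hits"]["hits"]
    if not hits:
        return None  # this MAC has at most `keep` events, nothing to delete
    ts = hits[0]["_source"]["eTime"]
    delete_by_query(INDEX, purge_query(mac, ts))
    return ts
```

This still issues two requests per MAC, which is exactly the cost I would like to avoid.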
    

This is certainly not efficient. Is there a way to combine all three steps into a single aggregation or pipeline?
