Cardinality aggregation with a time-range constraint

Hello,
I have this query

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "count_of_suffix": {
          "cardinality": {
            "field": "suffix.keyword"
          }
        },
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "count_of_suffix": "count_of_suffix"
            },
            "script": "params.count_of_suffix == 2"
          }
        }
      }
    }
  }
}

With output (truncated):

  "aggregations" : {
"num1" : {
  "doc_count_error_upper_bound" : 0,
  "sum_other_doc_count" : 0,
  "buckets" : [
    {
      "key" : "1563866656876839",
      "doc_count" : 106,
      "count_of_suffix" : {
        "value" : 2
      }
    },
    {
      "key" : "1563867854324841",
      "doc_count" : 50,
      "count_of_suffix" : {
        "value" : 2
      }
    },
    {
      "key" : "1563866656878888",
      "doc_count" : 42,
      "count_of_suffix" : {
        "value" : 2
      }
    },
    {
      "key" : "1563866656871111",
      "doc_count" : 40,
      "count_of_suffix" : {
        "value" : 2
      }

So it shows me the numbers that have both suffixes.

What I need is to somehow add a time-range constraint to the cardinality: if a num1 first has only one suffix, and the same num1 doesn't get its second suffix within some time window, e.g. one hour, then the bucket should not be shown even if count_of_suffix == 2 eventually.

Thank you for any help!!!
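One event-centric approximation, assuming every document carries a `@timestamp` date field (adjust the field name to your mapping): add `min`/`max` sub-aggregations on the timestamp and make the `bucket_selector` also require that the spread between the first and last event stays under one hour (3,600,000 ms):

```json
GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword",
        "order": { "_count": "desc" }
      },
      "aggs": {
        "count_of_suffix": {
          "cardinality": { "field": "suffix.keyword" }
        },
        "first_seen": {
          "min": { "field": "@timestamp" }
        },
        "last_seen": {
          "max": { "field": "@timestamp" }
        },
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "count_of_suffix": "count_of_suffix",
              "first_seen": "first_seen",
              "last_seen": "last_seen"
            },
            "script": "params.count_of_suffix == 2 && (params.last_seen - params.first_seen) <= 3600000L"
          }
        }
      }
    }
  }
}
```

Caveat: this measures the spread of all events for a given num1, not strictly the gap between the first appearance of each suffix, so treat it as a sketch rather than an exact answer to the question.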

That's an entity-centric question again.

So I cannot avoid it :sweat_smile: I'll study the topic. Thank you :))

What should help is to first understand what the problem is with doing this on an event-centric index: some queries simply can't be made efficient on that type of store because they are fighting physics (network speeds, RAM limitations).
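For reference, one common entity-centric pattern is to maintain a single document per entity (keyed by num1 or messageId) and fold each incoming event into it with a scripted upsert. A minimal sketch, assuming Elasticsearch 7.x and hypothetical names (`entity_index`, `suffixes`, `first_seen`, `last_seen`; the suffix value and timestamp are illustrative):

```json
POST /entity_index/_update/1563866656876839
{
  "scripted_upsert": true,
  "script": {
    "source": "if (!ctx._source.suffixes.contains(params.suffix)) { ctx._source.suffixes.add(params.suffix) } ctx._source.last_seen = params.ts",
    "params": {
      "suffix": "suffixA",
      "ts": "2019-07-23T08:00:00Z"
    }
  },
  "upsert": {
    "suffixes": [],
    "first_seen": "2019-07-23T08:00:00Z",
    "last_seen": "2019-07-23T08:00:00Z"
  }
}
```

With one document per entity, the original question becomes a cheap per-document check (two distinct suffixes, and `last_seen - first_seen` under an hour) instead of an expensive cross-event aggregation.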

In my case, I want to test the entity-centric index approach on a .txt file that is parsed by Logstash and then forwarded to an Elasticsearch log-centric index (a normal index? :smiley:).

Do I understand it correctly that, when I have data in some index, I need to split the log messages from that index into entity-centric indexes, so that each log message gets its own entity-centric index based on messageId?