Fetch top k frequent fields

Hi all,

I want to fetch the top k fields that occur most frequently, either in the documents from the last 5 minutes or across the whole index. I have tried some queries, such as the one below, to get the desired output, but they take a long time. I guess that's because of the Painless script. Can someone help me with the query, or suggest another API to fetch the k most frequent fields in the index?

GET /logs-*/_search
{
  "size": 0,
  "aggs": {
    "top_k_fields": {
      "terms": {
        "script": {
          "source": "return params._source.keySet()",
          "lang": "painless"
        },
        "size": 10
      }
    }
  }
}

Also, as a follow-up, I want to retrieve the values along with the top k frequent fields. For example, if field_A is the most frequent field and is available in all documents, I want to retrieve the data for that field as well. If this can be done with a single query, that would be great.

Thanks in advance.

For the whole index, there are the field statistics.

I’m curious why you want this info on last X minutes. Just curious.

Hi,

Thanks for the quick response. I have checked the _field_usage_stats API, but it won't give me the desired output. I want the top k fields based on which fields are available in most of the documents, not by usage of them. Suppose I have 10 documents, and field_A is present in 7 documents, field_B is present in 5 documents, and field_C is present in 3 documents. If I query the top 2 fields, it should return field_A and field_B.
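
To make the counting rule concrete, here is a minimal client-side sketch in Python. It is not an Elasticsearch API; the helper name `top_k_fields` and the sample documents are made up for illustration, and it only looks at top-level keys (nested objects are not flattened).

```python
from collections import Counter


def top_k_fields(docs, k):
    """Return the k field names present in the most documents.

    Hypothetical helper: `docs` is a list of dicts standing in for
    document sources. Each field counts at most once per document.
    """
    counts = Counter()
    for doc in docs:
        counts.update(doc.keys())
    return counts.most_common(k)


# Rebuild the example above: 10 documents, with field_A in 7,
# field_B in 5, and field_C in 3 of them.
docs = [{} for _ in range(10)]
for i in range(7):
    docs[i]["field_A"] = i
for i in range(5):
    docs[i]["field_B"] = i
for i in range(3):
    docs[i]["field_C"] = i

print(top_k_fields(docs, 2))  # [('field_A', 7), ('field_B', 5)]
```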

I'm curious why you want this info on last X minutes. Just curious.

What I mean is: if I add a filter (for example, on the last 5 minutes), is it still possible to get the desired output? In that case the documents would be filtered first, and only the matching ones should be counted.
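
One possible script-free direction, sketched under some assumptions: if the candidate field names are already known (for example, from the index mapping), a `filters` aggregation with one `exists` query per field returns a per-field document count, and a `range` query limits it to the recent window. The timestamp field `@timestamp`, the aggregation name `field_presence`, and both helper names are assumptions for illustration. The Python below only builds the request body and ranks a response; it does not talk to a cluster, and I have not measured how it performs on a large index.

```python
def build_field_presence_query(candidate_fields, timestamp_field="@timestamp", window="5m"):
    """Build a _search body that counts documents containing each candidate field.

    Assumptions: `timestamp_field` exists in the index and `candidate_fields`
    were collected beforehand; neither comes from the original post.
    """
    return {
        "size": 0,
        # Restrict counting to documents from the last `window`.
        "query": {"range": {timestamp_field: {"gte": f"now-{window}"}}},
        "aggs": {
            "field_presence": {
                "filters": {
                    # One named bucket per field, matched by an exists query.
                    "filters": {
                        name: {"exists": {"field": name}} for name in candidate_fields
                    }
                }
            }
        },
    }


def top_k_from_response(response, k):
    """Rank the named buckets of the `field_presence` aggregation by doc_count."""
    buckets = response["aggregations"]["field_presence"]["buckets"]
    counts = {name: bucket["doc_count"] for name, bucket in buckets.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:k]
```

The dictionary returned by `build_field_presence_query(["field_A", "field_B", "field_C"])` could be sent as the body of `GET /logs-*/_search`; passing the JSON response to `top_k_from_response(response, 2)` would then give the two fields with the highest document counts.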

Sorry on 2 counts.

One, you are right: the field stats keep a sort of count, but they don't count how many docs each field is present in.

Two, the answer quoted there makes no sense to me. But I was just curious.