I want to fetch the top k fields that occur most frequently, either in the documents from the last 5 minutes or in the whole index. I have tried some queries, as shown below, to get the desired output, but they take a long time. I suspect the Painless script is the cause. Can someone help me with the query, or suggest another API for fetching the k most frequent fields in the index?
As a follow-up, I also want to retrieve the values of the top k frequent fields. For example, if field_A is the most frequent field and is present in all documents, I want to retrieve the data for that field as well. If this can be done with a single query, that would be great.
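To make the follow-up concrete, here is a rough sketch of a search body that returns only the values of a given list of fields, using `_source` filtering and `exists` clauses. It assumes the top k field names are already known, so it is a second request rather than the single query asked for. The helper name `build_values_query` and the default `size` are made up for illustration.

```python
def build_values_query(top_fields, size=10):
    """Build a search body that returns only the given fields.

    top_fields is assumed to be the already-determined list of the
    most frequent field names (hypothetical input).
    """
    return {
        "size": size,
        # Limit the returned _source to the fields of interest
        "_source": list(top_fields),
        "query": {
            "bool": {
                # Match documents that contain at least one of the fields
                "should": [{"exists": {"field": name}} for name in top_fields],
                "minimum_should_match": 1,
            }
        },
    }


body = build_values_query(["field_A", "field_B"], size=5)
```

The resulting dictionary would be sent as the body of a `_search` request against the index.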
Thanks for the quick response. I have checked the _field_usage_stats API, but it doesn't give me the desired output. I want the top k fields ranked by how many documents contain them, not by how often they are used. Suppose I have 10 documents, where field_A is present in 7 documents, field_B in 5, and field_C in 3. If I query for the top 2 fields, it should return field_A and field_B.
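To pin down the expected result, the same counting can be sketched in plain Python over in-memory documents. This only illustrates the desired semantics (presence per document, top-level fields only); it is not an Elasticsearch query, and the sample documents and the extra field `other` are invented to match the numbers above.

```python
from collections import Counter


def top_k_fields(documents, k):
    """Return the k field names present in the most documents, with counts."""
    counts = Counter()
    for doc in documents:
        # Each top-level field counts once per document that contains it
        counts.update(doc.keys())
    return counts.most_common(k)


# 10 documents: field_A in 7, field_B in 5, field_C in 3
docs = (
    [{"field_A": 1, "field_B": 2, "field_C": 3}] * 3
    + [{"field_A": 1, "field_B": 2}] * 2
    + [{"field_A": 1}] * 2
    + [{"other": 0}] * 3
)

print(top_k_fields(docs, 2))  # [('field_A', 7), ('field_B', 5)]
```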
I'm curious why you want this information for the last X minutes.
What I mean is: if I add a filter, is it still possible to get the desired output, given that some documents will be filtered out in that case?
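One way to picture the filtered case is a script-free search body: a `range` query restricts the documents, and a `filters` aggregation with one `exists` query per field counts how many of the remaining documents contain each field. The sketch below only builds the request body and ranks the returned buckets. It assumes the candidate field names are known in advance (for instance, taken from the index mapping) and that the timestamp field is called `@timestamp`. The function names and the aggregation name `field_presence` are hypothetical.

```python
def build_field_presence_query(field_names, minutes=None, timestamp_field="@timestamp"):
    """Build a search body that counts, per field, the documents containing it."""
    body = {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "field_presence": {
                "filters": {
                    # One named bucket per candidate field
                    "filters": {
                        name: {"exists": {"field": name}} for name in field_names
                    }
                }
            }
        },
    }
    if minutes is not None:
        # Restrict counting to recent documents (timestamp field name is an assumption)
        body["query"] = {"range": {timestamp_field: {"gte": f"now-{minutes}m"}}}
    return body


def top_k_from_response(response, k):
    """Rank the named buckets of the response by doc_count, highest first."""
    buckets = response["aggregations"]["field_presence"]["buckets"]
    ranked = sorted(buckets.items(), key=lambda item: item[1]["doc_count"], reverse=True)
    return [(name, bucket["doc_count"]) for name, bucket in ranked[:k]]
```

With `minutes=5` the counts cover only documents from the last 5 minutes; with `minutes=None` they cover the whole index. Whether this is faster than a script would need to be measured on the actual data.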