I have the following requirement:
Each document has four fields: article_id, year, title, and version. If several documents share an article_id, a search should return only the one with the highest version among the hits. The resulting deduplicated list needs to be paginated for display. The documents in that list then need a second level of aggregation, grouped by year, and because there is a lot of data, the year groups also need paging and search after the aggregation. How should this be implemented? The first step of the query is already written:
First I aggregate on article_id, then use top_hits to get the document with the highest version, and use bucket_sort to page through the buckets. How should I implement the remaining part: the second-level aggregation by year, plus paging and search over the year buckets?
GET /ar_test2/_search
{
  "size": 0,
  "aggs": {
    "group_by_article_id": {
      "terms": {
        "field": "article_id"
      },
      "aggs": {
        "latest_documents": {
          "top_hits": {
            "_source": [ "*" ],
            "sort": [ { "version": "desc" } ],
            "size": 1
          }
        },
        "group_by_year": {
          "terms": {
            "field": "year",
            "size": 10,
            "order": {
              "_key": "desc"
            }
          }
        },
        "max_version": {
          "max": {
            "field": "version"
          }
        },
        "bucket_sorter": {
          "bucket_sort": {
            "sort": [
              {
                "max_version": {
                  "order": "desc"
                }
              }
            ],
            "from": 0,
            "size": 5
          }
        }
      }
    }
  }
}
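
To make the expected result concrete, here is a minimal in-memory sketch in Python of the behaviour I am after. It is not an Elasticsearch implementation, only a reference for what the final query should return. The function names (latest_per_article, group_by_year, page) and the sample documents are made up for illustration, and treating "search" as a case-insensitive substring match on title applied after deduplication is an assumption.

```python
from collections import defaultdict


def latest_per_article(docs):
    """Keep only the highest-version document for each article_id."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["article_id"])
        if current is None or doc["version"] > current["version"]:
            latest[doc["article_id"]] = doc
    return list(latest.values())


def group_by_year(docs, title_query=None):
    """Group documents by year, newest year first.

    title_query is a hypothetical search term; the substring match on
    title is an assumption about what "search" means here.
    """
    buckets = defaultdict(list)
    for doc in docs:
        if title_query is None or title_query.lower() in doc["title"].lower():
            buckets[doc["year"]].append(doc)
    # Mirrors "order": {"_key": "desc"} in the terms aggregation.
    return sorted(buckets.items(), key=lambda item: item[0], reverse=True)


def page(items, from_, size):
    """Return one page, like "from" and "size" in bucket_sort."""
    return items[from_:from_ + size]


# Made-up sample data.
docs = [
    {"article_id": "a1", "year": 2021, "title": "Intro v1", "version": 1},
    {"article_id": "a1", "year": 2022, "title": "Intro v2", "version": 2},
    {"article_id": "a2", "year": 2022, "title": "Paging", "version": 1},
    {"article_id": "a3", "year": 2020, "title": "Search basics", "version": 3},
    {"article_id": "a3", "year": 2020, "title": "Search basics", "version": 2},
]

latest = latest_per_article(docs)           # step 1: one document per article_id
year_buckets = group_by_year(latest)        # step 2: group the survivors by year
first_page = page(year_buckets, 0, 1)       # step 3: page over the year buckets

for year, items in first_page:
    print(year, [d["article_id"] for d in items])
```

The point of the sketch is the order of operations: the year grouping runs over the deduplicated documents, not over the raw index, which is the part the query above does not yet do.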