I have the query below, which buckets results and sorts them in the order I want, but I am unable to translate it to the Java API.
GET /my_data_set/_search
{
  "size": 0,
  "aggregations": {
    "toBeOrdered": {
      "terms": {
        "field": "Content.keyword",
        "size": 1000000,
        "order": [
          {
            "topSort": "asc"
          },
          {
            "nextSort": "asc"
          }
        ]
      },
      "aggregations": {
        "topAnswer": {
          "top_hits": {
            "size": 20,
            "from": 0,
            "sort": {
              "DateTime": "asc"
            },
            "_source": { "includes": ["LastName", "Content", "DateTime"] }
          }
        },
        "topSort": {
          "max": {
            "field": "LastName.raw"
          }
        },
        "nextSort": {
          "max": {
            "field": "Content.raw"
          }
        }
      }
    }
  }
}
An example document is:
{
  "Content": ...,
  "LastName": ...,
  "DateTime": ...
}
Edit 1:
The basic idea of the GET request above is the following. Imagine looking for documents with duplicate content and keeping only the oldest one in each group. Once you have the oldest document for each content value, you want to sort those documents by the authors' last names.
Sample input documents:
{
  "Content": "My first tweet",
  "LastName": "Doe",
  "DateTime": "2020-01-29'T'00:00:01"
},
{
  "Content": "My first tweet",
  "LastName": "Smith",
  "DateTime": "2020-01-29'T'01:00:00"
},
{
  "Content": "Some other tweet",
  "LastName": "Locke",
  "DateTime": "2020-01-30'T'00:00:01"
}
I would like to receive the following result:
{
  "Content": "My first tweet",
  "LastName": "Doe",
  "DateTime": "2020-01-29'T'00:00:01"
},
{
  "Content": "Some other tweet",
  "LastName": "Locke",
  "DateTime": "2020-01-30'T'00:00:01"
}
The tweets by Doe and Smith are bucketed together because they have the same "Content" value, and Doe's tweet is kept because Doe posted first. In the result, Doe's tweet comes before Locke's tweet because the last name Doe sorts before Locke alphabetically.
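To make the intended semantics concrete, here is a minimal in-memory sketch in plain Java. It is not the Elasticsearch Java API I am asking about, only an illustration of the result I expect. The class `OldestPerContent`, the `Tweet` type, and the method `oldestPerContent` are hypothetical names made up for this example, and it assumes the "DateTime" strings can be compared lexicographically, which holds for the sample values above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OldestPerContent {

    /** Hypothetical stand-in for one indexed document. */
    public static final class Tweet {
        public final String content;
        public final String lastName;
        public final String dateTime; // assumed to sort chronologically as a string

        public Tweet(String content, String lastName, String dateTime) {
            this.content = content;
            this.lastName = lastName;
            this.dateTime = dateTime;
        }
    }

    /**
     * Groups tweets by content, keeps the oldest tweet of each group,
     * then sorts the survivors by last name (ties broken by content).
     */
    public static List<Tweet> oldestPerContent(List<Tweet> tweets) {
        Map<String, Tweet> oldest = new LinkedHashMap<>();
        for (Tweet t : tweets) {
            // Keep whichever tweet with this content has the earlier DateTime.
            oldest.merge(t.content, t,
                    (a, b) -> a.dateTime.compareTo(b.dateTime) <= 0 ? a : b);
        }
        List<Tweet> result = new ArrayList<>(oldest.values());
        result.sort(Comparator.comparing((Tweet t) -> t.lastName)
                .thenComparing(t -> t.content));
        return result;
    }

    public static void main(String[] args) {
        List<Tweet> input = new ArrayList<>();
        input.add(new Tweet("My first tweet", "Doe", "2020-01-29'T'00:00:01"));
        input.add(new Tweet("My first tweet", "Smith", "2020-01-29'T'01:00:00"));
        input.add(new Tweet("Some other tweet", "Locke", "2020-01-30'T'00:00:01"));

        for (Tweet t : oldestPerContent(input)) {
            System.out.println(t.lastName + " | " + t.content + " | " + t.dateTime);
        }
    }
}
```

Note that this sketch sorts on the last name of the kept (oldest) tweet, following the description above, whereas the query sorts each bucket by a `max` sub-aggregation over the whole bucket.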
How do I write this in Java code?