Group by latest document plus aggregation for analysis, or delete old records

Hello everyone,

I am trying to get statistics from my Elasticsearch index for certain filters.
I managed to get some of them with this query:

GET <index>/_search
{
  "query": {
    "bool": {
      "must": [
        {"terms": {"type.keyword": ["video", "text"]}},
        {"terms": {"category.keyword": ["Mathematics", "Chemistry", "Biology", "Physics"]}}
      ]
    }
  },
  "size": 20,
  "aggs": {
    "source_language": {"terms": {"field": "source_language.keyword"}},
    "translation_language": {"terms": {"field": "translation_language.keyword"}}
  }
}

For the given types and categories, this returns the number of records for each source_language and translation_language.

I want to add another filter to these statistics.
Each doc has two identifiers (source_id, translation_id) and a date field.
I want to group by both identifiers and keep only the latest doc in each group, based on the date field.
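
To make the intended result concrete, here is a minimal Python sketch of the logic on in-memory data. The sample documents are made up and the date field name `date` is an assumption; only source_id, translation_id, and translation_language come from my index. It shows the semantics I am after, not an Elasticsearch query.

```python
from collections import Counter

# Made-up sample documents; "date" as the field name is an assumption.
docs = [
    {"source_id": 1, "translation_id": 10, "date": "2021-01-01", "translation_language": "fr"},
    {"source_id": 1, "translation_id": 10, "date": "2021-03-01", "translation_language": "de"},
    {"source_id": 2, "translation_id": 20, "date": "2021-02-01", "translation_language": "fr"},
]


def latest_per_pair(docs):
    """Keep only the newest document for each (source_id, translation_id) pair."""
    latest = {}
    for doc in docs:
        key = (doc["source_id"], doc["translation_id"])
        # ISO-8601 date strings compare correctly as plain strings.
        if key not in latest or doc["date"] > latest[key]["date"]:
            latest[key] = doc
    return list(latest.values())


latest_docs = latest_per_pair(docs)

# The same statistic as the terms aggregation, but over the latest versions only.
language_counts = Counter(d["translation_language"] for d in latest_docs)
print(language_counts)
```

The older "fr" version of pair (1, 10) is dropped before counting, which is exactly the behaviour I want from the aggregation.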

I tried top_hits, but it does not support sub-aggregations.

Also, I am on Elasticsearch 6.5 and cannot upgrade.

I need an equivalent of the multi_terms aggregation from 7.15, a way to get the latest document per group, and then a way to run my aggregations (for example on translation_language) over those documents.
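
One direction that might fit 6.5 is the composite aggregation, which as far as I can tell from the docs is available in 6.x, combined with a top_hits sub-aggregation of size 1 sorted by date. The sketch below only builds the request body in Python and does not talk to a cluster. The field names `source_id`, `translation_id`, and `date`, their mappings (keyword or numeric for the ids, date for the timestamp), the helper name, and the page size are all assumptions. The per-language counts would still have to be computed on the client from the returned hits.

```python
import json

# Same filter as in my current query.
query = {
    "bool": {
        "must": [
            {"terms": {"type.keyword": ["video", "text"]}},
            {"terms": {"category.keyword": ["Mathematics", "Chemistry", "Biology", "Physics"]}},
        ]
    }
}


def build_latest_per_pair_body(query, after_key=None, page_size=500):
    """Build a search body: one bucket per id pair, newest doc in each bucket.

    Hypothetical helper; field names and mappings are assumptions.
    """
    composite = {
        "size": page_size,
        "sources": [
            {"source_id": {"terms": {"field": "source_id"}}},
            {"translation_id": {"terms": {"field": "translation_id"}}},
        ],
    }
    if after_key is not None:
        # Pass the key of the last bucket of the previous page to fetch the next page.
        composite["after"] = after_key
    return {
        "size": 0,  # only the aggregation result is needed
        "query": query,
        "aggs": {
            "pairs": {
                "composite": composite,
                "aggs": {
                    "latest": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{"date": {"order": "desc"}}],
                            "_source": ["source_language", "translation_language"],
                        }
                    }
                },
            }
        },
    }


body = build_latest_per_pair_body(query)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `GET <index>/_search`. This is a sketch under the assumptions above, and I have not verified it against a 6.5 cluster.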

Basically, I want the same analysis, but on the latest versions only.

Any suggestions on where to start?

If this isn't possible, another approach that came to mind is to delete older versions whenever a new version is uploaded (a new version is identified by source_id, translation_id, and date).
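
For the delete approach, the part I would need is a list of the documents that are no longer the newest for their pair. A minimal Python sketch of that selection follows, assuming hits shaped like search results (`_id` plus `_source`) and a date field named `date`; the sample hits and the helper name are made up.

```python
def stale_doc_ids(hits):
    """Return the _id of every hit that is not the newest for its id pair."""
    newest = {}  # (source_id, translation_id) -> newest hit seen so far
    stale = []
    for hit in hits:
        src = hit["_source"]
        key = (src["source_id"], src["translation_id"])
        current = newest.get(key)
        if current is None:
            newest[key] = hit
        elif src["date"] > current["_source"]["date"]:
            # This hit is newer, so the previously kept one becomes stale.
            stale.append(current["_id"])
            newest[key] = hit
        else:
            stale.append(hit["_id"])
    return stale


# Made-up sample hits; dates are ISO-8601 strings, which sort correctly as text.
hits = [
    {"_id": "a", "_source": {"source_id": 1, "translation_id": 10, "date": "2021-01-01"}},
    {"_id": "b", "_source": {"source_id": 1, "translation_id": 10, "date": "2021-03-01"}},
    {"_id": "c", "_source": {"source_id": 2, "translation_id": 20, "date": "2021-02-01"}},
]

ids_to_delete = stale_doc_ids(hits)
print(ids_to_delete)
```

The resulting ids could then be removed, for example with delete actions in a `_bulk` request, but I am not sure this is the best way to do it at scale.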

The data is streamed from PostgreSQL with Logstash, using the JDBC plugin.
