Good day,
I am developing a solution that uses an index containing documents like the one below:
{
  "name": "Powlowski, Schaden and Kuvalis",
  "financials": [
    {
      "revenue_value_type": "range",
      "revenue_usd_average": 11,
      "revenue_usd": null,
      "fin_year": 2023
    },
    {
      "revenue_value_type": "exact",
      "revenue_usd_average": null,
      "revenue_usd": 100,
      "fin_year": 2024
    },
    {
      "revenue_value_type": "exact",
      "revenue_usd_average": null,
      "revenue_usd": 89,
      "fin_year": 2022
    }
  ],
  "industries": ["Goods", "Healthcare"],
  "markets": ["IT", "Technology"]
}
I want to be able to both sort on numeric values (for example, revenue) and filter on specific industries to get back a list of matching documents.
For example, to order by the revenue of only the latest financial year, I use a script like the one below:
{
  "size": 50,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "source": """
            def financials = params['_source']['financials'];
            if (financials == null) {
              return 0; // Default sort value if the field is missing
            }
            def latestYear = 0;
            def latestFinancial = null;
            // Find the financial object with the latest year
            for (def financial : financials) {
              if (financial['fin_year'] != null && financial['fin_year'] > latestYear) {
                latestYear = financial['fin_year'];
                latestFinancial = financial;
              }
            }
            if (latestFinancial == null) {
              return 0; // Default sort value if no financials exist
            }
            // Pick the revenue field that matches the value type
            def revenue = null;
            if (latestFinancial['revenue_value_type'] == 'range') {
              revenue = latestFinancial['revenue_usd_average'];
            } else if (latestFinancial['revenue_value_type'] == 'exact') {
              revenue = latestFinancial['revenue_usd'];
            }
            // Fall back to 0 when the chosen revenue field is null
            return revenue == null ? 0 : revenue;
          """
        },
        "order": "desc"
      }
    }
  ]
}
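To make clear what I expect the script to compute, here is a rough Python equivalent of the same logic applied to the sample document above. It is only a sketch for checking the expected sort value, not part of the actual query. The function name `latest_revenue` is made up for illustration, and the null handling (falling back to 0) is my own assumption about the desired behaviour.

```python
from typing import Optional


def latest_revenue(doc: dict) -> float:
    """Return the revenue of the latest financial year, or 0.0 if unavailable."""
    latest: Optional[dict] = None
    # Find the financial entry with the highest fin_year, skipping null years.
    for fin in doc.get("financials") or []:
        year = fin.get("fin_year")
        if year is not None and (latest is None or year > latest["fin_year"]):
            latest = fin
    if latest is None:
        return 0.0
    # Pick the revenue field that matches the value type.
    value = None
    if latest.get("revenue_value_type") == "range":
        value = latest.get("revenue_usd_average")
    elif latest.get("revenue_value_type") == "exact":
        value = latest.get("revenue_usd")
    # Assumption: a null revenue sorts as 0.
    return float(value) if value is not None else 0.0


sample = {
    "name": "Powlowski, Schaden and Kuvalis",
    "financials": [
        {"revenue_value_type": "range", "revenue_usd_average": 11,
         "revenue_usd": None, "fin_year": 2023},
        {"revenue_value_type": "exact", "revenue_usd_average": None,
         "revenue_usd": 100, "fin_year": 2024},
        {"revenue_value_type": "exact", "revenue_usd_average": None,
         "revenue_usd": 89, "fin_year": 2022},
    ],
}

print(latest_revenue(sample))  # 100.0 (the 2024 entry, type "exact")
```

So for the sample document I expect the sort key to be 100, taken from the 2024 entry.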
I wanted to ask whether I'm on the right track, or whether there are much better ways to accomplish this, such as pre-computing the value at index time.
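To make the pre-computation idea concrete, this is roughly what I have in mind, as a sketch only. It assumes a hypothetical top-level field `latest_revenue_usd` that I would compute before indexing each document, and it assumes `industries` is mapped as `keyword` so that a `terms` filter matches exact values. The helper names `add_latest_revenue` and `build_search_body` are made up for illustration.

```python
# Maps each revenue_value_type to the field that holds its number.
REVENUE_FIELD_BY_TYPE = {"range": "revenue_usd_average", "exact": "revenue_usd"}


def add_latest_revenue(doc: dict) -> dict:
    """Return a copy of doc with a hypothetical latest_revenue_usd field."""
    dated = [f for f in doc.get("financials") or [] if f.get("fin_year") is not None]
    latest = max(dated, key=lambda f: f["fin_year"], default=None)
    value = None
    if latest is not None:
        field = REVENUE_FIELD_BY_TYPE.get(latest.get("revenue_value_type"))
        value = latest.get(field) if field else None
    return {**doc, "latest_revenue_usd": value}


def build_search_body(industries: list, size: int = 50) -> dict:
    """Build a request body that filters by industry and sorts on the stored field."""
    return {
        "size": size,
        "query": {"bool": {"filter": [{"terms": {"industries": industries}}]}},
        # Documents without a value are placed last instead of sorting as 0.
        "sort": [{"latest_revenue_usd": {"order": "desc", "missing": "_last"}}],
    }


doc = add_latest_revenue({
    "name": "Powlowski, Schaden and Kuvalis",
    "financials": [
        {"revenue_value_type": "range", "revenue_usd_average": 11,
         "revenue_usd": None, "fin_year": 2023},
        {"revenue_value_type": "exact", "revenue_usd_average": None,
         "revenue_usd": 100, "fin_year": 2024},
    ],
    "industries": ["Goods", "Healthcare"],
})
body = build_search_body(["Healthcare"])
```

The trade-off I see is that the stored field has to be recomputed whenever a document's financials change, but the query itself would no longer need a script or access to `_source`.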
Many thanks