I've documents stored in Elastic Search in this way:
Doc1:
{
"owner": "owner_1",
"attributes" : [
{
"name": "dump",
"value": "xyz"
},
{
"name": "weight",
"value": "150"
}
]
}
Doc2:
{
"owner": "onwer_2",
"attributes" : [
{
"name": "dump",
"value": "xyz"
},
{
"name": "weight",
"value": "600"
}
]
}
Doc3:
{
"owner": "owner_3",
"attributes" : [
{
"name": "dump",
"value": "abc"
},
{
"name": "weight",
"value": "40"
}
]
}
Note: attributes
is a nested type field.
Problem statement:
- Retrieve documents that have the
name
dump in theirattributes
field. - Perform an aggregation based on the
value
of dump (abc, xyz). - Calculate the total weight for all documents that have the same dump value obtained from the previous step.
The result for the documents mentioned above resembles the following:
{
...
"aggregations": {
...
"buckets": [
{
"key": "xyz",
"doc_count": 2,
"weight": {
"value": 800
}
},
{
"key": "abc",
"doc_count": 1,
"weight": {
"value": 40
}
}
]
}
}
What steps I've taken so far?
I've prepared the following query.
{
"size": 0,
"query": {
"nested": {
"path": "doc.attributes",
"query": {
"bool": {
"must": [
{
"term": {
"doc.attributes.name.keyword": "dump"
}
}
]
}
}
}
},
"aggs": {
"attributes": {
"nested": {
"path": "doc.attributes"
},
"aggs": {
"NAME_BUCKET": {
"filter": {
"terms": {
"doc.attributes.name.keyword": [
"dump",
"weight"
]
}
},
"aggs": {
"VALUE_BUCKET": {
"terms": {
"field": "doc.attributes.value.keyword",
"size": 100
},
"aggs": {
"weight": {
"sum": {
"script": {
"lang": "painless",
"source": """
if(doc['doc.attributes.name.keyword'].value == "weight") {
return Double.parseDouble(doc['doc.attributes.value.keyword'].value);
}
return 0;
"""
}
}
}
}
}
}
}
}
}
}
}
The above query produces output similar to this:
{
...,
...,
"buckets": [
{
"key": "abc",
"doc_count": 1,
"Payment": {
"value": 0.0
}
},
{
"key": "xyz",
"doc_count": 2,
"Payment": {
"value": 0.0
}
},
{
"key": "150",
"doc_count": 1,
"Payment": {
"value": 150.0
}
},
{
"key": "40",
"doc_count": 1,
"Payment": {
"value": 40.0
}
},
{
"key": "650",
"doc_count": 1,
"Payment": {
"value": 650.0
}
}
]
}
Of course, the above query doesn't yield the expected output. The NAME_BUCKET aggregation aggregates records based on the dump and weight fields so that they can be used later in the Painless script. However, I am currently unable to determine how to aggregate weight based on the value of dump."
Your help would be highly appreciated.