Aggregation with nested fields. Newbie problem


(Rick Tessner) #1

Hi,

I'm still quite new to elasticsearch and have found myself needing to use a nested index. I've tried to distill the problem into something simple that I could post here.

This comes from a database where each record has anywhere from 1 to a few hundred children. Each child has a unique name that comes from a pool of about 3000+ different names.

Mapping (from kibana console)

PUT test_nest
{
    "mappings": {
        "_doc": {
            "dynamic": "false",
            "properties": {
                "full_name": {
                    "type": "text"
                },
                "v": {
                    "type": "nested",
                    "properties": {
                        "name": {
                            "type": "keyword"
                        },
                        "value": {
                            "type": "float"
                        }
                    }
                }
            }
        }
    }
}

Sample data

 POST _bulk
{"index": {"_index": "test_nest", "_type": "_doc", "_id": "1"}}
{"full_name": "Three", "v": [{"name": "test_vol", "value": 3}, {"name": "duration", "value": 30}]}
{"index": {"_index": "test_nest", "_type": "_doc", "_id": "2"}}
{"full_name": "One", "v": [{"name": "test_vol", "value": 1}, {"name": "duration", "value": 10}]}
{"index": {"_index": "test_nest", "_type": "_doc", "_id": "3"}}
{"full_name": "Two", "v": [{"name": "test_vol", "value": 2}, {"name": "duration", "value": 20}]}

I'm trying to get an average value of all the v.name == test_vol. The query I use below does not work since it averages all the v.value entries.

I'm looking on how to limit the scope of the v.value entries to only those matching test_vol. Sorry if this is a newbie question but I'm at my wits end.

Aggregation Query

GET test_nest/_search
{
    "size": 0,
    "query": {
        "nested": {
            "path": "v",
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "v.name": "test_vol"
                            }
                        }
                        
                    ]
                }
            }
        }
    },
    "aggs": {
        "avg_test_vol": {
            "nested": {
                "path": "v"
            },
            "aggs": {
                "thing": {
                    "avg": {
                        "field": "v.value"
                    }
                }
            }
        }
    }
}

Results of query. I'd want to get an average of 2. Instead I get 11.

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 3,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "avg_test_vol": {
            "doc_count": 6,
            "thing": {
                "value": 11
            }
        }
    }
}

Thanks in advance for any hints on how to scope the documents used for the aggregation,

Rick


(Alexander Reelsen) #2

try

GET test_nest/_search
{
  "size": 0,
  "query": {
    "nested": {
      "path": "v",
      "query": {
        "match": {
          "v.name": "test_vol"
        }
      }
    }
  },
  "aggs": {
    "avg_test_vol": {
      "nested": {
        "path": "v"
      },
      "aggs": {
        "inner": {
          "filter": {
            "match": {
              "v.name": "test_vol"
            }
          }, 
          "aggs": {
            "thing": {
              "avg": {
                "field": "v.value"
              }
            }
          }
        }
      }
    }
  }
}

(Rick Tessner) #3

Perfect. Thank you very much Alexander!