Write a query to aggregate by any fields in mapping


(android.kc) #1

I am trying to build a query with a 3-level aggregation over ANY fields in the mapping. The fields can be nested or non-nested, as in the mapping below. For example, we can pick [array_A.a, array_B.b, A] or [B, array_B.a, array_B.b] to generate a 3-level aggregation.

Note that the order of the fields matters: aggregating by [array_A.a, array_B.b, A] and by [A, array_A.a, array_B.b] gives different results.

To aggregate by a field of a nested type, we include "nested": {"path": "root_B"} in the query; for a non-nested (root-level) field, we may need to include "reverse_nested" if the parent aggregation is on a nested type.
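As a rough illustration of the nested / reverse_nested pairing, here is a hand-written request body for the two fields [array_B.b, A] from the mapping below, written as a Python dict and printed as JSON. The aggregation names (in_array_B, by_b, back_to_root, by_A) are placeholders I made up, and this is only a sketch that I have not run against a cluster.

```python
import json

# Aggregation names are placeholders; field names come from the mapping in this post.
body = {
    "size": 0,
    "aggs": {
        "in_array_B": {
            # Step into the nested documents under array_B.
            "nested": {"path": "array_B"},
            "aggs": {
                "by_b": {
                    "terms": {"field": "array_B.b"},
                    "aggs": {
                        "back_to_root": {
                            # Step back out to the root document before
                            # aggregating on the root-level field A.
                            "reverse_nested": {},
                            "aggs": {"by_A": {"terms": {"field": "A"}}},
                        }
                    },
                }
            },
        }
    },
}

print(json.dumps(body, indent=4))
```

The printed JSON can be passed to curl with -d in the same way as the queries below.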

When building a query from 3 randomly selected fields, we need to check whether the parent field is a nested type and whether it has the same path as the child. For example, the two queries below return different results. Q1 returns the expected result, but Q2 returns no data in the deepest buckets[], since it tries to access the field array_A.array_A.b, which does not exist. Besides checking whether the parent field is a nested type, we also need to know whether "reverse_nested" is needed. This takes quite a bit of effort to implement.

It seems that the nested path is a "relative path" between the parent and the child. Can we use an "absolute path" in "nested": {"path": "aPath"} instead, to simplify the logic?

Q1:
curl -XPOST "http://localhost:9200/myindex/_search" -d '{
    "size":0,
    "aggs": {
        "aggName": {
            "terms": {
                "field": "A"
            },
            "aggs": {
                "nestedAggName": {
                    "nested": {
                        "path": "array_A"
                    },
                    "aggs": {
                        "appName": {
                            "terms": {
                                "field": "array_A.a"
                            },
                            "aggs": {
                                "appName": {
                                    "terms": {
                                        "field": "array_A.b"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}'

Q2:
curl -XPOST "http://localhost:9200/myindex/_search" -d '{
    "size":0,
    "aggs": {
        "aggName": {
            "terms": {
                "field": "A"
            },
            "aggs": {
                "nestedAggName": {
                    "nested": {
                        "path": "array_A"
                    },
                    "aggs": {
                        "appName": {
                            "terms": {
                                "field": "array_A.a"
                            },
                            "aggs": {
                                "nestedAggName": {
                                    "nested": {
                                        "path": "array_A"
                                    },
                                    "aggs": {
                                        "appName": {
                                            "terms": {
                                                "field": "array_A.b"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}'


"mappings": {
    "type": {
        "properties": {
            "array_A": {
                "include_in_parent": "true",
                "properties": {
                    "a": {
                        "type": "integer"
                    },
                    "b": {
                        "index": "not_analyzed",
                        "type": "string"
                    }
                },
                "type": "nested"
            },
            "array_B": {
                "properties": {
                    "a": {
                        "index": "not_analyzed",
                        "type": "string"
                    },
                    "b": {
                        "type": "integer"
                    }
                },
                "type": "nested"
            },
            "A": {
                "index": "not_analyzed",
                "type": "string"
            },
            "B": {
                "index": "not_analyzed",
                "type": "string"
            }      
        }
    }
}
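For what it's worth, the bookkeeping described above (same path as the parent or not, reverse_nested or not) can be sketched as a small recursive builder. This is only my own sketch in Python, not anything provided by Elasticsearch: NESTED_PATHS is hard-coded from the mapping above, the helper names (nested_path_of, build_aggs) and the generated aggregation names are made up, and I have not verified the output against a cluster. It emits "nested" only when the field's path differs from the one the enclosing aggregation is already in, and "reverse_nested" when leaving a nested path.

```python
import json

# Assumption: nested paths are copied by hand from the mapping in this post.
NESTED_PATHS = ("array_A", "array_B")


def nested_path_of(field):
    """Return the nested path that owns `field`, or None for a root-level field."""
    for path in NESTED_PATHS:
        if field.startswith(path + "."):
            return path
    return None


def build_aggs(fields, context=None):
    """Build a chain of terms aggregations over `fields`, in the given order.

    `context` is the nested path the enclosing aggregation is already in;
    None means the root document.
    """
    if not fields:
        return {}
    field, rest = fields[0], fields[1:]
    target = nested_path_of(field)

    terms = {"terms": {"field": field}}
    sub = build_aggs(rest, target)
    if sub:
        terms["aggs"] = sub
    # Dots are replaced to keep the generated aggregation names plain.
    aggs = {"by_" + field.replace(".", "_"): terms}

    if target != context:
        if target is not None:
            # Entering a nested path we are not in yet.
            aggs = {"to_" + target: {"nested": {"path": target}, "aggs": aggs}}
        if context is not None:
            # Leaving the current nested path: go back to the root first.
            aggs = {"to_root": {"reverse_nested": {}, "aggs": aggs}}
    return aggs


if __name__ == "__main__":
    body = {"size": 0, "aggs": build_aggs(["A", "array_A.a", "array_A.b"])}
    print(json.dumps(body, indent=4))
```

With [A, array_A.a, array_A.b] this produces the same shape as Q1 (a single "nested" wrapper around both array_A fields), apart from the aggregation names. With [array_A.a, array_B.b, A] it wraps array_B.b in "reverse_nested" followed by "nested" on array_B, and A in another "reverse_nested".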

(system) #2