Group by "field" OR group by missing "field"


(Vitaliy Teremasov) #1

I am having some difficulty with Elasticsearch.

Here is what I want to do:

Let's say a document in my index looks like this:

{
  transacId: "qwerty",
  amount: 150,
  userId: "asdf",
  client: "mobile",
  goal: "purchase"
}

I want to build different kinds of statistics from this data, and Elasticsearch does that really fast. The problem is that in my system a user can add a new field to a transaction on demand. Let's say we have another document in the same index:

{
  transacId: "qrerty",
  amount: 200,
  userId: "asdf",
  client: "mobile",
  goal: "purchase",
  token_1: "game"
}

So now I want to group by token_1.

{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      terms: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}} 
    }
  }
}

The problem is that this aggregates only the documents that have the token_1 field. I know there is a missing aggregation, so I can do something like this:

{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      missing: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}} 
    }
  }
}

But in this case it aggregates only the documents without the token_1 field. What I want is to aggregate both kinds of documents in one query. I tried the following, but it didn't work either:

{
  query: {
    match: {userId: "asdf"}
  },
  aggs: {
    token_1: {
      missing: {field: "token_1"},
      aggs: {sumAmt: {sum: {field: "amount"}}} 
    },
    aggs: {
      token_1: {
        missing: {field: "token_1"},
        aggs: {sumAmt: {sum: {field: "amount"}}} 
      }
    }
  }
}

I thought there might be something like an OR operator for aggregations, but I couldn't find anything. Please help.


(Dan Tuffery) #2

You need two sibling aggregations, missing_token_1 and token_1:

{
    "query": {
        "match": {
            "userId": "asdf"
        }
    },
    "aggs": {
        "missing_token_1": {
            "missing": {
                "field": "token_1"
            },
            "aggs": {
                "sumAmt": {
                    "sum": {
                        "field": "amount"
                    }
                }
            }
        },
        "token_1": {
            "terms": {
                "field": "token_1"
            },
            "aggs": {
                "sumAmt": {
                    "sum": {
                        "field": "amount"
                    }
                }
            }
        }
    }
}
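
Since the request above returns the two aggregations as separate entries, the client usually has to stitch them back together. The following is a minimal Python sketch of that step, not part of the original answer. It assumes the usual aggregation response layout (a "buckets" list for terms, a single bucket with "doc_count" for missing); `merge_buckets`, `missing_label`, and `sample_response` are hypothetical names, and the sample values are made up to match the two example documents.

```python
from typing import Any, Dict


def merge_buckets(response: Dict[str, Any], field: str = "token_1",
                  missing_label: str = "(missing)") -> Dict[str, float]:
    """Combine the terms buckets and the missing bucket into one mapping."""
    aggs = response["aggregations"]
    # One entry per distinct value of the field.
    totals = {
        bucket["key"]: bucket["sumAmt"]["value"]
        for bucket in aggs[field]["buckets"]
    }
    # The missing aggregation is a single bucket, not a list of buckets.
    missing = aggs[f"missing_{field}"]
    if missing["doc_count"] > 0:
        totals[missing_label] = missing["sumAmt"]["value"]
    return totals


# Hypothetical response for the two example documents (assumed shape).
sample_response = {
    "aggregations": {
        "missing_token_1": {"doc_count": 1, "sumAmt": {"value": 150.0}},
        "token_1": {
            "buckets": [
                {"key": "game", "doc_count": 1, "sumAmt": {"value": 200.0}}
            ]
        },
    }
}

print(merge_buckets(sample_response))
# prints {'game': 200.0, '(missing)': 150.0}
```

Newer Elasticsearch versions also document a "missing" parameter on the terms aggregation that treats documents without the field as if they had a given value, which may let a single aggregation cover both cases. Whether it is available depends on the version in use.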

(Vitaliy Teremasov) #3

Thank you, it works. I see that my mistake was adding an extra "aggs" key at the start of the second aggregation. Anyway, thank you!
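
To illustrate the structural point: sibling aggregations are separate named keys of the same "aggs" object, and a nested "aggs" key belongs only inside a named aggregation, for its sub-aggregations. A rough Python sketch that builds the working request body this way follows; `with_sum` and `build_body` are hypothetical helper names, not part of any Elasticsearch client.

```python
from typing import Any, Dict


def with_sum(agg_type: str, params: Dict[str, Any], sum_field: str) -> Dict[str, Any]:
    """Return one named-aggregation body: a bucket aggregation plus a nested sum."""
    return {
        agg_type: params,
        # "aggs" here holds the sub-aggregations of this one bucket aggregation.
        "aggs": {"sumAmt": {"sum": {"field": sum_field}}},
    }


def build_body(user_id: str, field: str) -> Dict[str, Any]:
    """Build a request body with a missing and a terms aggregation side by side."""
    return {
        "query": {"match": {"userId": user_id}},
        # Both aggregations are keys of the SAME top-level "aggs" object;
        # there is no extra "aggs" wrapper around the second one.
        "aggs": {
            f"missing_{field}": with_sum("missing", {"field": field}, "amount"),
            field: with_sum("terms", {"field": field}, "amount"),
        },
    }


body = build_body("asdf", "token_1")
print(sorted(body["aggs"]))
# prints ['missing_token_1', 'token_1']
```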

