Aggregation in Elasticsearch

I am working with Elasticsearch aggregations and have a question about how to do a pipeline-style aggregation. My ES documents have three top-level fields:

documentId, list1, list2

Example:
Here are a few of the documents I have:

document 1:


{
  "documentId":"1",
  "list1": 
  [
    {
      "key": "key1",
      "value": "value11"
    }
  ],
  "list2": 
  [
    {
      "key": "key2",
      "value": "value21"
    }
...
  ]
}

document 2:

{
  "documentId":"2",
  "list1": 
  [
    {
      "key": "key1",
      "value": "value11"
    }
  ],
  "list2": 
  [
    {
      "key": "key2",
      "value": "value21"
    }
...
  ]
}

document 3:

{
  "documentId":"3",
  "list1": 
  [
    {
      "key": "key1",
      "value": "value12"
    }
  ],
  "list2": 
  [
    {
      "key": "key2",
      "value": "value21"
    }
...
  ]
}

To summarize:

document1 and document2 have the same values for key1 and key2 (only the id differs, so they are treated as two separate documents).

document3 has the same value for key2 as document1 and document2, but its value for key1 is different.

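I want to run a terms aggregation on the keys of the list1 field and use its output as the input to a terms aggregation on list2. In other words, each distinct list1 value should contribute only one count to a list2 value, no matter how many documents share it.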

So, for the above example, the overall output I want is:
value21: 2
(one count for value11 under key1 and a second count for value12 under key1)

and NOT
value21: 3 (two counts for value11 under key1 and a third count for value12 under key1).
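
To make the difference between the two counts concrete, here is a small Python sketch that computes both on the three sample documents, outside Elasticsearch. The function names are hypothetical, and the documents are trimmed to the entries shown above (the elided "..." entries in list2 are left out), so this only illustrates the counting I am after and is not a query.

```python
from collections import defaultdict

# The three sample documents, trimmed to the entries shown in the post.
docs = [
    {"documentId": "1",
     "list1": [{"key": "key1", "value": "value11"}],
     "list2": [{"key": "key2", "value": "value21"}]},
    {"documentId": "2",
     "list1": [{"key": "key1", "value": "value11"}],
     "list2": [{"key": "key2", "value": "value21"}]},
    {"documentId": "3",
     "list1": [{"key": "key1", "value": "value12"}],
     "list2": [{"key": "key2", "value": "value21"}]},
]


def values_for(entries, key):
    """Return the values of all entries in a list field that carry the given key."""
    return [entry["value"] for entry in entries if entry["key"] == key]


def distinct_counts(documents, inner_key="key1", outer_key="key2"):
    """Wanted: per list2 value, the number of distinct list1 values seen with it."""
    seen = defaultdict(set)
    for doc in documents:
        for outer_value in values_for(doc["list2"], outer_key):
            seen[outer_value].update(values_for(doc["list1"], inner_key))
    return {value: len(inner_values) for value, inner_values in seen.items()}


def document_counts(documents, outer_key="key2"):
    """Not wanted: per list2 value, the number of documents containing it."""
    counts = defaultdict(int)
    for doc in documents:
        for outer_value in set(values_for(doc["list2"], outer_key)):
            counts[outer_value] += 1
    return dict(counts)


if __name__ == "__main__":
    print(distinct_counts(docs))   # the count I want
    print(document_counts(docs))   # what a plain terms aggregation would give
```

On the sample data the first function gives 2 for value21 (value11 and value12 each counted once), while the second gives 3 (one per document).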

Is there any simple way of doing this?
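
For reference, one shape such a request might take is sketched below as a Python function that builds the search body. This is only a guess under stated assumptions: that list1 and list2 are mapped as `nested` fields and that `key` and `value` are keyword fields (the mapping is not shown above), and every aggregation name is a hypothetical label. It buckets list2 values with a `terms` aggregation, steps back to the parent document with `reverse_nested`, and counts distinct list1 values with a `cardinality` aggregation, whose result is approximate for large numbers of distinct values.

```python
import json


def build_distinct_count_request(outer_key="key2", inner_key="key1"):
    """Build a search body: bucket by list2 value, count distinct list1 values per bucket.

    Assumes list1 and list2 are mapped as `nested` and that `key` and `value`
    are keyword fields. All aggregation names are hypothetical labels.
    """
    # Inside each bucket: enter list1, keep only the wanted key, count distinct values.
    distinct_inner = {
        "nested": {"path": "list1"},
        "aggs": {
            "only_inner_key": {
                "filter": {"term": {"list1.key": inner_key}},
                "aggs": {
                    "distinct_values": {"cardinality": {"field": "list1.value"}}
                },
            }
        },
    }
    # One bucket per list2 value; reverse_nested returns to the parent document
    # so that the sibling nested field list1 can be reached.
    per_outer_value = {
        "terms": {"field": "list2.value"},
        "aggs": {
            "back_to_root": {
                "reverse_nested": {},
                "aggs": {"list1_entries": distinct_inner},
            }
        },
    }
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "list2_entries": {
                "nested": {"path": "list2"},
                "aggs": {
                    "only_outer_key": {
                        "filter": {"term": {"list2.key": outer_key}},
                        "aggs": {"by_value": per_outer_value},
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_distinct_count_request(), indent=2))
```

The printed JSON could be sent as the body of a `_search` request. If the lists are plain object fields rather than `nested`, the `nested` and `reverse_nested` layers would not apply, and the key/value pairing inside each list entry would not be preserved.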
