Aggregating array of objects

Im trying to figure out what im doing wrong, i have collected the following, "Sub set of data", "Current query", "Current output" & "Desired output" if more is required please let me know.

This is how my data objects look

[{
  "survey_answers": [
    {
      "id": "9ca01568e8dbb247", // As they are, this is the key to groupBy
      "option_answer": 5, // Represent the index of the choosen option
      "type": "OPINION_SCALE" // Opinion scales are 0-10 (meaning elleven options)
    },
    {
      "id": "ba37125ec32b2a99",
      "option_answer": 3,
      "type": "LABELED_QUESTIONS" // Labeled questions are 0-x (they can change it from survey to survey)
    }
  ],
  "survey_id": "test"
},
{
  "survey_answers": [
    {
      "id": "9ca01568e8dbb247",
      "option_answer": 0,
      "type": "OPINION_SCALE"
    },
    {
      "id": "ba37125ec32b2a99",
      "option_answer": 3,
      "type": "LABELED_QUESTIONS"
    }
  ],
  "survey_id": "test"
}]

This is how my current query/agg object looks

{
  "query": {
    "match": {
      "survey_id": "test"
    }
  },
  "aggs": {
    "agg_survey_answers": {
      "terms": {
        "field": "survey_answers.id.keyword"
      },
      "aggs": {
        "agg_option_answer": {
          "terms": {
            "field": "survey_answers.option_answer"
          }
        }
      }
    }
  }
}

Current output

{
  "agg_survey_answers": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [{
      "key": "9ca01568e8dbb247",
      "doc_count": 2,
      "agg_option_answer": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{
          "key": 3,
          "doc_count": 2
        }, {
          "key": 0,
          "doc_count": 1
        }, {
          "key": 5,
          "doc_count": 1
        }]
      }
    }, {
      "key": "ba37125ec32b2a99",
      "doc_count": 2,
      "agg_option_answer": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{
          "key": 3,
          "doc_count": 2
        }, {
          "key": 0,
          "doc_count": 1
        }, {
          "key": 5,
          "doc_count": 1
        }]
      }
    }]
  }
}

Desired output

{
  "agg_survey_answers": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [{
      "key": "9ca01568e8dbb247",
      "doc_count": 2,
      "agg_option_answer": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{
          "key": 0,
          "doc_count": 1
        }, {
          "key": 5,
          "doc_count": 1
        }]
      }
    }, {
      "key": "ba37125ec32b2a99",
      "doc_count": 2,
      "agg_option_answer": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [{
          "key": 3,
          "doc_count": 2
        }]
      }
    }]
  }
}

Hello simon,

can you please elaborate ,what type of query to be generated with field names like,
Ex:
I need to groupby id to get survey_answers.... etc

so that it will be easy to form query and give you the solution.

So given this document structure

{
  "survey_answers": [
    {
      "id": "9ca01568e8dbb247",
      "option_answer": 0,
      "type": "OPINION_SCALE"
    },
    {
      "id": "ba37125ec32b2a99",
      "option_answer": 3,
      "type": "LABELED_QUESTIONS"
    }
  ],
  "survey_id": "test"
}

I would like to aggregate each object in the "survey_answers" array with the same array in other database entries

right now it counts across all the objects in "survey_answers" but it should only count like this:

entry1 -> survey_answers[0]
Should be aggregated with
entry2 -> survey_answers[0]

whereas it counts it all together right now so it doesn't care about the index in survey_answers it just count all sub objects together in one big mashup

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.