A query to create flat table from different data

ronlik26 · August 30, 2017, 9:01am

Hi,

I'm trying to come up with a query that will help me take my data and aggregate it into one table.

I have many records such as the following:

{
"_index": "doc",
"_type": "info",
"_id": "4",
"_routing": "123",
"_parent": "123",
"_source": {
			"data:{
                      "name": "kuku",
			}
			"id": "123"
		    }
}

{
"_index": "doc",
"_type": "info",
"_id": "6",
"_routing": "678",
"_parent": "678",
"_source": {
			"data:{
                     "name": "kuku",
			}
			"id": "678"
		   }
}

{
"_index": "doc",
"_type": "info",
"_id": "8",
"_routing": "213",
"_parent": "213",
"_source": {
			"data:{
                     "name": "kuku2",
			}
			"id": "213"
		     }
}

and i want to go over all the info types and create the following:

Kuku: [123, 678]
Kuku2: [213]

I wanted to ask if its possible to create this with one query?
A query that will go over all the info types and each time take the name field and add the id field to a list by its name field

Thank you!

abdon · September 1, 2017, 8:45am

While it doesn't give you a flat table, a terms aggregation nested in another terms aggregation would give you what you want in a single request. Assuming the name and id fields are mapped as type keyword, or have fielddata enabled, the following request:

GET doc/info/_search
{
  "size": 0, 
  "aggs":{
    "names": {
      "terms": {
        "field": "data.name",
        "size": 100
      },
      "aggs": {
        "ids": {
          "terms": {
            "field": "id",
            "size": 100
          }
        }
      }
    }
  }
}

Returns:

  "aggregations": {
    "names": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "kuku",
          "doc_count": 2,
          "ids": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "123",
                "doc_count": 1
              },
              {
                "key": "678",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "kuku2",
          "doc_count": 1,
          "ids": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "213",
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }

Some things to be aware of here: because of the "size": 100 you will only get the 100 most common names and IDs. You can change the size parameter if you need more or less values. Also, The values are ranked by popularity (how common they are - based on the doc_count). If you want, you can order the results differently.

system · September 29, 2017, 8:46am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Multi Field Aggregation Elasticsearch	6	1028	July 6, 2017
Aggregation with nested fields. Newbie problem Elasticsearch	3	380	January 4, 2019
Logstash converting data from a flattened table from a SQL query into JSON inner object Logstash	5	781	September 19, 2019
Kibana-Create Data Table-Attributes with different _type Kibana	10	1908	July 6, 2017
How do I query elasticsearch in a way that returns entire rows instead of nested buckets? Elasticsearch	1	386	July 3, 2018

A query to create flat table from different data

Related topics