How do I select a field that has a 1-to-1 relation with the terms aggregation field?


#1

I am using Elasticsearch for aggregation. I want to select the userid, username, and profit fields, grouped by userid. The userid and username fields have a 1-to-1 relationship.

The problem is that to get username into the JSON response, I have to make username a sub-aggregation.

The relevant part of the query is:

"aggs": 
{
	"userid": 
	{
		"terms": 
		{
			"field": "userid",
			"size": 0
		},
		"aggs": 
		{
			"username": 
			{
				"terms": 
				{
					"field": "username",
					"size": 1
				}
			},

			"profit_sum": 
			{
				"sum": 
				{
					"script": "doc['in'].value - doc['out'].value",
					"lang": "expression"
				}
			}
		}
	}
}

Is there any way to select username as a field without making it a sub term aggregation?
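
For readers following along, here is a minimal Python sketch of how the nested response from the query above could be flattened into one row per userid on the client side. It assumes the usual terms-aggregation response layout (aggregations, then userid, then buckets). The helper name flatten_buckets and the sample response are hypothetical, made up for illustration only.

```python
def flatten_buckets(response):
    """Turn the nested userid/username buckets into flat rows."""
    rows = []
    for bucket in response["aggregations"]["userid"]["buckets"]:
        # With a 1-to-1 mapping and "size": 1, the username
        # sub-aggregation holds at most one bucket.
        name_buckets = bucket["username"]["buckets"]
        username = name_buckets[0]["key"] if name_buckets else None
        rows.append({
            "userid": bucket["key"],
            "username": username,
            "profit_sum": bucket["profit_sum"]["value"],
        })
    return rows


# Hypothetical response, shaped like the output of the query above.
sample_response = {
    "aggregations": {
        "userid": {
            "buckets": [
                {
                    "key": 436334,
                    "doc_count": 3,
                    "username": {"buckets": [{"key": "john", "doc_count": 3}]},
                    "profit_sum": {"value": 120.0},
                },
                {
                    "key": 436335,
                    "doc_count": 1,
                    "username": {"buckets": []},
                    "profit_sum": {"value": -5.0},
                },
            ]
        }
    }
}

rows = flatten_buckets(sample_response)
```

This does not remove the sub-aggregation from the request. It only hides the extra nesting from whatever code consumes the result.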


(Mark Harwood) #2

If I understand correctly, you want to replace the key used in your userid response buckets with the key found in the first (and only) child username bucket.

I don't believe there is a way to do that. Presumably the user names are not unique which is why you want to group at the top level by userid.

This is an age-old issue of computers wanting to work with unique IDs vs people wanting readable labels. I've come to the conclusion that for a lot of analytics use cases with Kibana etc., it makes sense to combine IDs and labels into a single token to meet both needs. Something like [436334] John Smith. If you don't do this, you always have the issue of how to attach readable labels to aggregation outputs.
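
As a rough illustration of the combined-token idea, here is a minimal Python sketch that builds such a token before indexing and splits it again when reading bucket keys. The layout is taken from the "[436334] John Smith" example. The function names are hypothetical, and the sketch assumes numeric IDs.

```python
import re

# Assumed token layout, taken from the "[436334] John Smith" example:
# a numeric ID in square brackets, one space, then the label.
TOKEN_PATTERN = re.compile(r"^\[(\d+)\] (.*)$")


def make_token(user_id, user_name):
    """Combine an ID and a readable label into a single token."""
    return f"[{user_id}] {user_name}"


def parse_token(token):
    """Split a combined token back into (user_id, user_name)."""
    match = TOKEN_PATTERN.match(token)
    if match is None:
        raise ValueError(f"not a combined token: {token!r}")
    return int(match.group(1)), match.group(2)


token = make_token(436334, "John Smith")
user_id, user_name = parse_token(token)
```

For a terms aggregation to return the token whole, the combined value would need to be indexed as a single un-analyzed term rather than as analyzed text.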


#3

Sorry, I didn't express myself clearly. In my current project I want to replace a MySQL database with Elasticsearch, which means rewriting the aggregation queries that currently run in MySQL as Elasticsearch aggregations.

For example, the MySQL statement:
select userid, username, profits from mtable group by userid

I want to do the same thing in Elasticsearch. The problem is that I cannot simply choose which fields the response includes. In MySQL it is as simple as "select userid, username", but in Elasticsearch I have to aggregate on userid first and then add a sub-aggregation on username just to get the username into the JSON response.

It would be good if I could choose the fields. I would like the aggregation clause to look like the following. Is that possible?

"aggs": 
{
	"userid": 
	{
		"terms": 
		{
			"field": "userid",
			"size": 0
		},
		"fields": ["userid", "username"],
		"profit_sum": 
		{
			"sum": 
			{
				"script": "doc['in'].value - doc['out'].value",
				"lang": "expression"
			}
		}
	}
}
