Aggregate & Count by two fields


(Safwan Erooth) #1

I am having the following data.e

  1. songs
  2. play_event

In songs the data is as below:

song_id  total_plays
1		 2000
2		 4532
3		 9999
4		 2343

And in play event the data is as below:

user_id	song_id
102			1
103			4
102			1
102			3
104			2
102			1

For each time a song was played, there is a new entry, even is a song is played again.

With this data I want to :

Get total no. of time each user played each songs. For example, if user_id 102 played, the song_id 1 three times, as per above data. I want to have it grouped by the user_id with total count. Something like below:

	user_id  song_id  count
	102		 1		  3
	102		 3		  1
	103		 4		  1
	104		 2		  1

Can this be done with ElasticSearch via a query instead of having to calculate and store it back to Elastic? I have the flexibility in indexing the data to elastic in any structure, if the structure I followed is wrong or not ideal.


(Christoph) #2

Hi @esafwan,

do you need the song_id counts for all users at the same time (e.g. for doing some data anlysis) or is it okay for your application to just get the counts for one particular user (e.g. for a users webpage dashboard or sth. similar?). In the first case I think you might be able to do a histogram aggregation with interval 1 on the user_id (if it is a numeric id value, your data suggests that) and nest another histogram aggregation (interval 1) on the song_id inside. Alternatively terms aggregation might work but those are more expensive and typically restricted to the top N hits.

This might lead to a large number of buckets though if you run it on the whole dataset, so if you are able to restrict it to one user, I'd add a filter on the user_id first and then only do a histogram (or terms aggregation) for the song_id of this one user.

Hope this helps as a starter for research, if you need more information it would help if you provide more info on your use case and your actual Elasticsearch mapping.


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.