Nested terms facet : return document count instead of terms count


(Damien Ferey) #1

I'm trying to do the following:

A player can play to many games.
Every game session has a begining date and a end date.

  • a player can have simultaneous game sessions
  • a player can have multiple sessions of a same game

(to simplify the samples I set only full years as dates)

I would like a facet/aggregation that filters only the active sessions at a given date.

ie:
Player1:

  • HearthStone: 2015-2016
  • Magic : 2005-2016

Player2:

  • HearthStone: 2014-2015
  • Magic : 2005-2008 & 2009-2010

So if I choose the date 2015, I should have the following result:

  • HearthStone : 2
  • Magic : 1

I can't manage to get what I want.

I provide my test in the following gist:

Can you give me a clue on how to rearrange my data and/or my facet/aggregation query ?


(Dan Tuffery) #2

Facets are deprecated, so use aggregations.

To get the results you want you need to use a nested filter aggregation (to filter out games that are not within the required date range) and for the counts of the name field use a nested terms aggregation. To get the counts of the nested documents I would recommend moving the name field into the same object as the beginDate and endDate fields:

{
  "player":{
	"properties":{
	  "games":{
		"type":"nested",
		"properties":{
		  "name":{
			"type":"string",
			"index":"not_analyzed"
		  },
		  "beginDate":{
			"type":"date",
			"format":"dd/MM/yyyy"
		  },
		  "endDate":{
			"type":"date",
			"format":"dd/MM/yyyy"
		  }
	      }
	  }
      }
   }
}

I noticed in your example you have used beginDate in your mapping but startDate in the documents.

Your docs will now be:

{
  "games":[
	{
	  "name":"HearthStone",
	  "beginDate":"01/01/2014",
	  "endDate":"01/01/2015"
	},
	{
	  "name":"Magic",
	  "beginDate":"01/01/2005",
	  "endDate":"01/01/2008"
	},
	{
	  "name":"Magic",
	  "beginDate":"01/01/2009",
	  "endDate":"01/01/2010"
	}
    ]
}

So the trade-off here is that the name field is duplicated in the data.

With this mapping and data you can return the counts of the nested docs that are within the date range :

{
  "aggs":{
	"user_games":{
	  "nested":{
		"path":"games"
	  },
	  "aggs":{
		"games_filter":{
		  "filter":{
			"bool":{
			  "must":[
				{
				  "range":{
					"games.beginDate":{
					  "lte":"01/01/2015"
					}
				  }
				},
				{
				  "range":{
					"games.endDate":{
					  "gte":"01/01/2015"
					}
				    }
				}
			    ]
			}
		  },
		  "aggs":{
			"game_name":{
			  "terms":{
				"field":"games.name"
			  }
		      }
		  }
	      }
	  }
       }
   }
}

(Damien Ferey) #3

Thank you a lot, I think I can figure now how to get the result I wanted to !


(system) #4