OK, I think I got most of what I needed, but I still can't get all the
way there because aggregations don't support paging. I also learned how
powerful aggregations are in Elasticsearch.
I changed the document structure a little (added the primaryId field):
POST sport/football_team
{
  "primaryId": "541afe09532aec0f305c5f2b",
  "name": "Real Madrid",
  "defense_strength": 88.2,
  "middle_strength": 92.34,
  "forward_strength": 97.45,
  "player_ids": [
    "1", "2", "3", "4", "21", "6", "7", "8", "9", "10", "11"
  ]
}
This is what I ended up with:
POST sport/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "defense_strength": {
                  "lte": 83.43
                }
              }
            },
            {
              "range": {
                "forward_strength": {
                  "gte": 91
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "top_teams": {
      "terms": {
        "field": "primaryId"
      },
      "aggs": {
        "top_team_hits": {
          "top_hits": {
            "sort": [
              {
                "forward_strength": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "include": [
                "name"
              ]
            },
            "from": 0,
            "size": 1
          }
        }
      }
    }
  }
}
The result is what I expected:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "top_teams": {
      "buckets": [
        {
          "key": "541afdfc532aec0f305c2c48",
          "doc_count": 2,
          "top_team_hits": {
            "hits": {
              "total": 2,
              "max_score": null,
              "hits": [
                {
                  "_index": "sport",
                  "_type": "football_team",
                  "_id": "y6jZ31xoQMCXaK23rPQgjA",
                  "_score": null,
                  "_source": {
                    "name": "Barcelona"
                  },
                  "sort": [
                    98.32
                  ]
                }
              ]
            }
          }
        },
        {
          "key": "541afe08532aec0f305c5f28",
          "doc_count": 2,
          "top_team_hits": {
            "hits": {
              "total": 2,
              "max_score": null,
              "hits": [
                {
                  "_index": "sport",
                  "_type": "football_team",
                  "_id": "hewWI0ZpTki4OgOeneLn1Q",
                  "_score": null,
                  "_source": {
                    "name": "Arsenal"
                  },
                  "sort": [
                    94.3
                  ]
                }
              ]
            }
          }
        },
        {
          "key": "541afe09532aec0f305c5f2b",
          "doc_count": 1,
          "top_team_hits": {
            "hits": {
              "total": 1,
              "max_score": null,
              "hits": [
                {
                  "_index": "sport",
                  "_type": "football_team",
                  "_id": "x-_YBX5jSba8qsEuB8guTQ",
                  "_score": null,
                  "_source": {
                    "name": "Real Madrid"
                  },
                  "sort": [
                    91.34
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
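The response is deeply nested, so here is a rough Python sketch of how the top hit of each bucket could be flattened into one row per team. The function name top_hit_per_bucket and the row layout are made up for illustration; only the aggregation names (top_teams, top_team_hits) and the field names come from the request and response above.

```python
def top_hit_per_bucket(response, agg_name="top_teams", hits_name="top_team_hits"):
    """Return one row per terms bucket, built from that bucket's first top_hits entry.

    Hypothetical helper: it assumes the response has the shape shown above.
    """
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        if not hits:
            # A bucket without hits is skipped rather than guessed at.
            continue
        top = hits[0]
        rows.append({
            "primaryId": bucket["key"],
            "name": top["_source"]["name"],
            # The sort value is the forward_strength the hit was ordered by.
            "forward_strength": top["sort"][0],
        })
    return rows
```

With the response above this should give three rows, one each for Barcelona, Arsenal and Real Madrid.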
All good, but what I need now is the ability to get the first 2 aggregation
results in one request and the next 2 (in this case, only 1) in another request.
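Until aggregations can be paged on the server, one possible workaround is to page on the client: ask the terms aggregation for enough buckets to cover the requested page (through its "size" option) and slice the list afterwards. The Python sketch below is only an illustration under that assumption. The helper names build_paged_request and slice_page are hypothetical, the filter from the query above is left out for brevity, and the pages only line up if the buckets come back in the same order on every request.

```python
def build_paged_request(page, page_size, field="primaryId"):
    """Build a request body asking for enough buckets to cover `page` (0-based).

    Hypothetical helper; the aggregation layout mirrors the query in this post.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        "size": 0,
        "aggs": {
            "top_teams": {
                # Request every bucket up to the end of the wanted page.
                "terms": {"field": field, "size": (page + 1) * page_size},
                "aggs": {
                    "top_team_hits": {
                        "top_hits": {
                            "sort": [{"forward_strength": {"order": "desc"}}],
                            "_source": {"include": ["name"]},
                            "size": 1,
                        }
                    }
                },
            }
        },
    }


def slice_page(response, page, page_size, agg_name="top_teams"):
    """Keep only the buckets that belong to `page`, dropping the earlier ones."""
    buckets = response["aggregations"][agg_name]["buckets"]
    start = page * page_size
    return buckets[start:start + page_size]
```

For the three buckets in the result above, page 0 with a page size of 2 would keep the first two buckets and page 1 would keep the remaining one. The cost is that later pages fetch and discard all earlier buckets, so this is only reasonable for a small number of groups.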
On Wednesday, September 24, 2014 1:18:27 PM UTC+3, Tugberk Ugurlu wrote:
In my sport index, I have the following documents indexed as football_team
type:
expected_result.js · GitHub
Here, each football team has a name and some strength values. Besides
that, there is a player_ids collection for each team. The team strength has
been calculated by taking the average of the players' strengths during the ETL
process. You can also see that there are multiple football teams with the
same name here, but their player_ids collections are different.
When we run the following query:
expected_result.js · GitHub
We will get the following result:
expected_result.js · GitHub
Which is expected. However, what I would like to get here is the top 1 row of
each group (grouped by the team name). The result I would like to get for
the above query is this:
expected_result.js · GitHub
Any idea?
Also, here is the whole question in a gist:
expected_result.js · GitHub
Tugberk