Hi folks, I'm looking for an equivalent of SQL's 'group by' or 'distinct'. Basically, I have this query:
{
"query": {
"range": {
"post_date": {
"from": "2016-04-14 00:00:00",
"to": "2016-04-15 00:00:00"
}
}
},
"aggregations": {
"keywords": {
"significant_text": {
"field": "post_content",
"size": 50,
"background_filter": {
"range": {
"post_date": {
"from": "2016-01-01 00:00:00",
"to": "2016-04-13 00:00:00"
}
}
}
}
}
}
}
The problem is that if I have a lot of posts between April 14 and April 15, 2016 in the same category, it skews the results I'm looking for. For example, if I entered 15 posts about "Jack", I don't want Jack to show up with "doc_count": 15 if all of those entries are in the same category.
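To make the counting I'm after concrete, here is a small sketch on made-up data (the keyword/category pairs and both function names are hypothetical, not anything from my index). It contrasts counting every post with counting each category at most once per keyword:

```python
from collections import defaultdict

# Hypothetical toy data: one (keyword, category term_id) pair per matching post.
posts = [("Jack", 1)] * 15 + [("Jill", 1), ("Jill", 2), ("Jill", 3)]


def count_by_post(pairs):
    """Count every post that mentions a keyword (what doc_count reflects)."""
    counts = defaultdict(int)
    for keyword, _term_id in pairs:
        counts[keyword] += 1
    return dict(counts)


def count_by_category(pairs):
    """Count each category at most once per keyword (the behaviour I want)."""
    seen = defaultdict(set)
    for keyword, term_id in pairs:
        seen[keyword].add(term_id)
    return {keyword: len(term_ids) for keyword, term_ids in seen.items()}


print(count_by_post(posts))      # {'Jack': 15, 'Jill': 3}
print(count_by_category(posts))  # {'Jack': 1, 'Jill': 3}
```

With per-post counting, "Jack" dominates because of 15 posts in a single category; with per-category counting it drops to 1.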
The category is available as term_id in the documents, like this:
{
"_index": "indexname",
"_type": "_doc",
"_id": "6149376",
"_score": 1,
"_source": {
"post_id": 6149376,
"ID": 6149376,
"post_author": {
"raw": "admin",
"login": "admin",
"display_name": "admin",
"id": 1
},
"post_date": "2016-04-14 01:34:17",
"post_date_gmt": "2016-04-14 01:34:17",
"post_title": "title",
"post_excerpt": "",
"post_content_filtered": "content",
"post_status": "publish",
"post_name": "title",
"post_modified": "2016-04-14 01:34:17",
"post_modified_gmt": "2016-04-14 01:34:17",
"post_parent": 0,
"post_type": "post",
"post_mime_type": "",
"permalink": "https://example.com/title",
"terms": {
"category": [
{
"term_id": 1,
"slug": "uncategorized",
"name": "Uncategorized",
"parent": 0,
"term_taxonomy_id": 1,
"term_order": 0,
"facet": "{\"term_id\":1,\"slug\":\"uncategorized\",\"name\":\"Uncategorized\",\"parent\":0,\"term_taxonomy_id\":1,\"term_order\":0}"
}
]
}
}
}
I would appreciate pointers on how to modify the query to aggregate by term_id. If this can be done for the background_filter as well, that would also be helpful.
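One direction that might fit, sketched below as an untested assumption rather than a confirmed solution, is to wrap the significant_text aggregation in a diversified_sampler keyed on the category, so that each category contributes at most a fixed number of posts to the foreground set. The field path terms.category.term_id is guessed from the sample document, and the names build_query and per_category_sample are hypothetical. As far as I know, the diversifying field has to be single-valued, so posts with several categories may be rejected, and the limit is applied per shard. This only changes the foreground sample; the background_filter is left as it was.

```python
import json

# Assumed field path, inferred from the sample document's "terms.category" array.
CATEGORY_FIELD = "terms.category.term_id"


def build_query(max_docs_per_category=1, shard_size=200):
    """Build the original query with the keywords aggregation nested in a sampler."""
    keywords = {
        "significant_text": {
            "field": "post_content",
            "size": 50,
            "background_filter": {
                "range": {
                    "post_date": {
                        "from": "2016-01-01 00:00:00",
                        "to": "2016-04-13 00:00:00",
                    }
                }
            },
        }
    }
    return {
        "query": {
            "range": {
                "post_date": {
                    "from": "2016-04-14 00:00:00",
                    "to": "2016-04-15 00:00:00",
                }
            }
        },
        "aggregations": {
            # Hypothetical aggregation name; limits how many posts per category
            # are sampled (per shard) before significant_text runs.
            "per_category_sample": {
                "diversified_sampler": {
                    "field": CATEGORY_FIELD,
                    "max_docs_per_value": max_docs_per_category,
                    "shard_size": shard_size,
                },
                "aggregations": {"keywords": keywords},
            }
        },
    }


if __name__ == "__main__":
    # Print the request body so it can be pasted into a search request.
    print(json.dumps(build_query(), indent=2))
```

In the response, the keyword buckets would then appear under per_category_sample rather than at the top level of the aggregations.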