Terms Aggregation on Text field, fielddata true. Wrong doc_count


(anks) #1

Hi

I am trying to run a terms aggregation on a full-text field of type 'text' with fielddata enabled.

I am getting a doc_count of 1 per matching document instead of the frequency of the term.

Mapping -

{"mappings":{"data":{"properties":{"content":{"type":"text","fielddata":true} } }}}

Index data -

/test/data/1
{"content":"concrete concrete"}

Query -
{
  "query": {
    "match": {
      "_id": "1"
    }
  },
  "aggs": {
    "content": {
      "terms": {
        "field": "content"
      }
    }
  }
}

Response -

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "data",
        "_id": "1",
        "_score": 1,
        "_source": {
          "content": "concrete concrete"
        }
      }
    ]
  },
  "aggregations": {
    "content": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "concrete",
          "doc_count": 1
        }
      ]
    }
  }
}

Shouldn't doc_count be 2?

Thanks,
Ankita


(Fanfan) #2

In fact, the number of documents matching the query is one. If you index the data {"content":"concrete concrete guess"} and run the terms aggregation on it, the result is:
"buckets": [
  {
    "key": "concrete",
    "doc_count": 1
  },
  {
    "key": "guess",
    "doc_count": 1
  }
]
I think this indicates that the field is first analyzed into the two terms "concrete" and "guess", duplicate terms are removed, and the documents are then grouped by those two terms.
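
To illustrate the difference between the two counts, here is a minimal Python sketch. It is not Elasticsearch code: `analyze`, `terms_doc_count` and `term_frequency` are hypothetical names, and `analyze` is only a rough stand-in for the analyzer (it is assumed to lowercase and split on whitespace, which is enough for this input).

```python
from collections import Counter


def analyze(text):
    # Hypothetical stand-in for the analyzer: lowercase, split on whitespace.
    return text.lower().split()


def terms_doc_count(docs):
    # Mimics what a terms aggregation reports: each document is counted
    # once per distinct term, however often the term repeats inside it.
    counts = Counter()
    for text in docs:
        counts.update(set(analyze(text)))
    return dict(counts)


def term_frequency(docs):
    # Counts every occurrence of every term, which is what the
    # original question expected to see.
    counts = Counter()
    for text in docs:
        counts.update(analyze(text))
    return dict(counts)


docs = ["concrete concrete guess"]
print("doc_count per term:", terms_doc_count(docs))
print("occurrences per term:", term_frequency(docs))
```

In this sketch "concrete" gets a doc_count of 1 but an occurrence count of 2, which matches the buckets above. If the per-document frequency is what you need, the term vectors API reports a term_freq for each term, which may be a better fit than a terms aggregation.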

