Aggregation Module - value_count problem


(watsindename) #1

Hi,

I am trying to get the number of unique values for a given field. From my
understanding of "value_count"
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/search-aggregations-metrics-valuecount-aggregation.html),
it counts the number of values that are extracted from the aggregated
documents. After running the steps in
https://gist.github.com/shivprak/8611922#file-es-agg-test-1-sh and
querying for unique values of field "k", the response that I get is

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 20,
    "max_score": 1,
    "hits": [] //deleted content of "hits" to shorten the paste
  },
  "aggregations": {
    "k_count": {
      "value": 22
    }
  }
}

How can I be getting "value" 22? Shouldn't it be 20, as that's the number of
unique "k" values in the documents?

Another example, on querying for unique values of field "t" I get the
following response

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 20,
    "max_score": 1,
    "hits": [] //deleted content of "hits" to shorten the paste
  },
  "aggregations": {
    "k_count": {
      "value": 20
    }
  }
}

Again, shouldn't the "value" be 1, as the only value of "t" is 23 in all
documents?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/92ade2ec-6998-497d-a253-5decb7a733bc%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jun Ohtani) #2

Hi,

I think “value_count” counts the number of values (terms) in each doc and adds them up. It does not count distinct values.

Your first question: Why is “k_count” 22 instead of 20?

Your example field “k” is analyzed using the standard tokenizer.
The 2nd doc’s “k” is “a few” and the 5th doc’s “k” is “next year”.
The standard tokenizer splits each of these texts into two terms.
So “value_count” counts 2 values in the 2nd doc and 2 values in the 5th doc.
Every other doc contributes 1 value.
Finally, the aggregator sums the counts from all docs and returns 22 (18 docs with 1 value plus 2 docs with 2 values).
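
As a rough illustration of the counting described above, here is a minimal Python sketch. It is not Elasticsearch code: the tokenize function is only a crude stand-in for the standard tokenizer, and the 18 single-word values are made up. Only “a few” and “next year” come from the example data.

```python
import re

def tokenize(text):
    # Crude stand-in for the standard tokenizer (an assumption, not the real
    # implementation): lowercase the text and split on non-word characters.
    return [token for token in re.split(r"\W+", text.lower()) if token]

# Hypothetical "k" values: 18 made-up single-word values plus the two
# two-word values mentioned in the thread.
k_values = [f"word{i}" for i in range(18)] + ["a few", "next year"]

# Count the terms in each doc, then sum across docs, which is how the
# value_count result is explained above.
per_doc_counts = [len(tokenize(k)) for k in k_values]
value_count = sum(per_doc_counts)

print(len(k_values))  # 20 docs
print(value_count)    # 22 values
```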

Your second question: Why is “t_count” 20 instead of 1?

The aggregator counts the values (terms) in each doc without removing duplicates.
Each of the 20 docs has one value for “t”, so the aggregator returns 20.
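
The difference between counting values and counting distinct values can be sketched in plain Python. Again, this is only an illustration under assumptions: the t_values list is a made-up model of 20 docs that each hold the single value 23, and nothing here calls Elasticsearch.

```python
# Hypothetical model of the index: 20 docs, each with one value (23) for "t".
t_values = [[23] for _ in range(20)]

# What value_count reports, as explained above: every value is counted,
# duplicates included.
value_count = sum(len(values) for values in t_values)

# What the original question expected: the number of distinct values.
distinct_count = len({value for values in t_values for value in values})

print(value_count)     # 20
print(distinct_count)  # 1
```

If a distinct count is what you need, note that later Elasticsearch releases added a “cardinality” aggregation that returns an approximate distinct count. Whether it is available depends on the version you are running.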

Does it make sense?


Jun Ohtani
johtani@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani

(In reply to the message from watsindename@gmail.com, sent 2014/01/25 14:06.)

