Terms_stats facet

Hi,

I am not sure whether this is a bug or the intended behaviour. Let's take an
example:
curl -XPUT 'http://localhost:9200/test1'
curl -XPUT 'http://localhost:9200/test1/type1/1' -d '{"obj1" : [{"name" : "blue","count" : 4},{"name" : "green","count" : 6}]}'
curl -XPOST 'http://localhost:9200/test1/type1/_search?pretty=1' -d '{"query": {"match_all": {}},"facets": {"facet1": {"terms_stats": {"key_field" : "obj1.name","value_field": "obj1.count"}}}}'
This returns:
{
  "took":1,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1.0,
    "hits":[
      {
        "_index":"test1",
        "_type":"type1",
        "_id":"1",
        "_score":1.0,
        "_source":{
          "obj1":[
            {
              "name":"blue",
              "count":4
            },
            {
              "name":"green",
              "count":6
            }
          ]
        }
      }
    ]
  },
  "facets":{
    "facet1":{
      "_type":"terms_stats",
      "missing":0,
      "terms":[
        {
          "term":"green",
          "count":1,
          "total_count":2,
          "min":4.0,
          "max":6.0,
          "total":10.0,
          "mean":5.0
        },
        {
          "term":"blue",
          "count":1,
          "total_count":2,
          "min":4.0,
          "max":6.0,
          "total":10.0,
          "mean":5.0
        }
      ]
    }
  }
}

My question is about this part:
{
  "term":"green",
  "count":1,
  "total_count":2,
  "min":4.0,
  "max":6.0,
  "total":10.0,
  "mean":5.0
},
{
  "term":"blue",
  "count":1,
  "total_count":2,
  "min":4.0,
  "max":6.0,
  "total":10.0,
  "mean":5.0
}
Why are total_count, min, max, total and mean calculated over the whole obj1
array of the document, and not separately for each term (green, blue)?

When obj1 is mapped as a nested type and the request is:
curl -XPOST 'http://localhost:9200/test/type1/_search?pretty=1' -d '{"query": {"match_all": {}},"facets": {"facet1": {"terms_stats": {"key_field" : "name","value_field": "count"}},"nested":"obj1"}}'
the result is what I expect:
{
  "took":0,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1.0,
    "hits":[
      {
        "_index":"test",
        "_type":"type1",
        "_id":"1",
        "_score":1.0,
        "_source":{
          "obj1":[
            {
              "name":"blue",
              "count":4
            },
            {
              "name":"green",
              "count":6
            }
          ]
        }
      }
    ]
  },
  "facets":{
    "facet1":{
      "_type":"terms_stats",
      "missing":0,
      "terms":[
        {
          "term":"green",
          "count":1,
          "total_count":1,
          "min":6.0,
          "max":6.0,
          "total":6.0,
          "mean":6.0
        },
        {
          "term":"blue",
          "count":1,
          "total_count":1,
          "min":4.0,
          "max":4.0,
          "total":4.0,
          "mean":4.0
        }
      ]
    }
  }
}
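
For anyone reproducing the nested case, the setup it relies on could look roughly like the sketch below. It only builds the request bodies and prints them as curl commands; nothing is sent. The mapping body and the `_mapping` URL are my assumptions about a typical setup for index `test` and type `type1`, and the helper name `as_curl` is made up for this example. I put `"nested"` inside the facet definition, next to `terms_stats`, which is the form I remember from the nested type documentation; please check the exact placement against your version.

```python
import json

# Assumed mapping: declare "obj1" as a nested type for type "type1".
# It has to be in place before the document is indexed.
nested_mapping = {"type1": {"properties": {"obj1": {"type": "nested"}}}}

# Facet request scoped to the nested objects under "obj1".
# Assumption: "nested" sits inside the facet definition, beside "terms_stats".
nested_facet_request = {
    "query": {"match_all": {}},
    "facets": {
        "facet1": {
            "terms_stats": {"key_field": "name", "value_field": "count"},
            "nested": "obj1",
        }
    },
}


def as_curl(method, url, body):
    """Render a request as a curl command line (hypothetical helper)."""
    return "curl -X{} '{}' -d '{}'".format(method, url, json.dumps(body))


print(as_curl("PUT", "http://localhost:9200/test/type1/_mapping", nested_mapping))
print(as_curl("POST", "http://localhost:9200/test/type1/_search?pretty=1",
              nested_facet_request))
```

The index itself (`test`) still has to be created first, in the same way as `test1` above.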

But why does the first case behave as shown above? Is it a bug or a
feature? :slight_smile:

Thanks.
Best regards.
Marcin.

First, please gist your samples; they are hard to read in the mail. The terms
stats facet won't work properly when both the key and the value are
multi-valued within the document. You can use a nested mapping and a
nested-scope facet to treat each object in the array as its own "document".
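
To make the multi-valued point concrete, here is a toy model in Python. It is not the Elasticsearch implementation, only a sketch that reproduces the numbers from this thread under one assumption: with a plain object mapping, the array of objects is flattened into one list of values per field, so the facet can no longer tell which `count` belonged to which `name`. The functions `flatten` and `terms_stats` are made up for this illustration.

```python
from collections import defaultdict

doc = {"obj1": [{"name": "blue", "count": 4}, {"name": "green", "count": 6}]}


def flatten(document, path):
    """Collapse an array of objects into one multi-valued list per field."""
    fields = defaultdict(list)
    for obj in document[path]:
        for key, value in obj.items():
            fields["{}.{}".format(path, key)].append(value)
    return dict(fields)


def terms_stats(docs, key_field, value_field):
    """Toy facet: within a document, every key sees every value."""
    collected = defaultdict(lambda: {"count": 0, "values": []})
    for d in docs:
        for term in d.get(key_field, []):
            collected[term]["count"] += 1
            collected[term]["values"].extend(d.get(value_field, []))
    result = {}
    for term, entry in collected.items():
        values = entry["values"]
        if not values:
            continue  # no values to aggregate for this term
        result[term] = {
            "count": entry["count"],
            "total_count": len(values),
            "min": float(min(values)),
            "max": float(max(values)),
            "total": float(sum(values)),
            "mean": sum(values) / len(values),
        }
    return result


# Plain object mapping: one document holding two names and two counts.
flat = terms_stats([flatten(doc, "obj1")], "obj1.name", "obj1.count")

# Nested mapping: each object in the array is treated as its own document.
nested_docs = [flatten({"obj1": [obj]}, "obj1") for obj in doc["obj1"]]
nested = terms_stats(nested_docs, "obj1.name", "obj1.count")

print(flat["green"])
print(nested["green"])
```

In the flat case both terms end up with total_count 2, min 4.0, max 6.0, total 10.0 and mean 5.0, as in the first response; in the nested case "green" only sees 6 and "blue" only sees 4.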

Hi,

Sorry, but I do not know what a "gist sample" is; could you explain? :slight_smile:
Now I get how the terms facets work with multi-valued fields, and this even
seems to be an advantage to me.

I have one question about nested fields: is it possible to access top-level
document fields from a nested-scope facet? For example, given this document:
{
  "name":"sample name",
  "tags":[
    {
      "name":"tag1",
      "value":"value1"
    },
    {
      "name":"tag2",
      "value":"value2"
    }
  ]
}
Let's assume that 'tags' is a nested field. Is it possible to access the
top-level 'name' field from a nested-scope facet for 'tags'? I mean, is it
possible to construct a facet for 'tags' that includes a "term" filter on the
'name' field, so that only documents with a given value for 'name' are
counted? As far as I know it is not possible, but I want to be sure I am not
missing something.

Thanks.
Best regards.

Everything about gist is explained here: http://www.elasticsearch.org/help/

David.

--
David Pilato
http://dev.david.pilato.fr/
Twitter : @dadoonet

When you say access name, what do you mean? As a script, or being able to
compose a query that checks on name and some nested values? If it's the
latter, then yes, you can do it: a query against nested fields just needs to
be wrapped in a nested query.
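
As a rough sketch of what such a combined query could look like for the example document from this thread, the Python below builds a request body with a `bool` query: one `term` clause on the top-level `name` field and one clause on `tags.name` wrapped in a `nested` query with path `tags`. The function name and the sample values are only illustrative, and the body is printed rather than sent. Keep in mind that a `term` query matches indexed tokens, so with the default analysis "sample name" would be indexed as the separate tokens "sample" and "name".

```python
import json


def name_and_tag_query(doc_name_token, tag_name):
    """Build a search body that checks the top-level name and a nested tag."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Clause on the top-level (parent) document field.
                    {"term": {"name": doc_name_token}},
                    # Clause on a nested field, wrapped in a nested query.
                    {
                        "nested": {
                            "path": "tags",
                            "query": {"term": {"tags.name": tag_name}},
                        }
                    },
                ]
            }
        }
    }


body = name_and_tag_query("sample", "tag1")
print(json.dumps(body, indent=2))
```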

Hi,

I mean the latter. Thank you for the answer, Shay.

Best regards.
Marcin
