Facet count unexpected result


(John) #1

To illustrate the problem, I create 16 records. 14 are in Phoenix and
2 are in Scottsdale:

curl -XPUT 'http://localhost:9200/subscribers/subscriber/1' -d '{ "name": "John", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/2' -d '{ "name": "Joe", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/3' -d '{ "name": "Jack", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/4' -d '{ "name": "Janet", "location": "Scottsdale" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/5' -d '{ "name": "Jane", "location": "Scottsdale" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/6' -d '{ "name": "Bill", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/7' -d '{ "name": "Steve", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/8' -d '{ "name": "Lucy", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/9' -d '{ "name": "Zane", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/10' -d '{ "name": "George", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/11' -d '{ "name": "Adam", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/12' -d '{ "name": "Eve", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/13' -d '{ "name": "William", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/14' -d '{ "name": "Alan", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/15' -d '{ "name": "Luke", "location": "Phoenix" }'
curl -XPUT 'http://localhost:9200/subscribers/subscriber/16' -d '{ "name": "Bo", "location": "Phoenix" }'

To find the most popular locations, I use a terms facet:

curl -XGET 'http://localhost:9200/subscribers/subscriber/_search?pretty=true' -d '
{
  "size" : 0,
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "locations" : {
      "terms" : {
        "field" : "location",
        "size" : 2
      }
    }
  }
}'

This works as expected:

"facets" : {
  "locations" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 16,
    "other" : 0,
    "terms" : [ {
      "term" : "phoenix",
      "count" : 14
    }, {
      "term" : "scottsdale",
      "count" : 2
    } ]
  }
}

Now I want to find the single most popular location (I changed the
size field from 2 to 1):

curl -XGET 'http://localhost:9200/subscribers/subscriber/_search?pretty=true' -d '
{
  "size" : 0,
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "locations" : {
      "terms" : {
        "field" : "location",
        "size" : 1
      }
    }
  }
}'

Result:

"facets" : {
  "locations" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 16,
    "other" : 3,
    "terms" : [ {
      "term" : "phoenix",
      "count" : 13
    } ]
  }
}

This is not what I expected. With size 1, I still expected "count" to be 14
for phoenix and "other" to be 2. Am I missing something?
I am running version 0.17.6.


(Karussell) #2

Do you have more than one shard? Have a look at this discussion:

https://github.com/elasticsearch/elasticsearch/issues/1305

Peter
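
To see why the shard count matters, here is a rough sketch in Python. It is not Elasticsearch code: it assumes that each shard reports only its own top `size` terms and that the per-shard lists are then summed, and it computes "other" as the total minus the returned counts, which is consistent with the numbers in the question. The five-shard layout and the function names are hypothetical; the real document distribution is not known from the thread.

```python
from collections import Counter


def shard_top_terms(docs, size):
    """Top `size` terms of a single shard (assumed per-shard behaviour)."""
    return Counter(docs).most_common(size)


def merged_facet(shards, size):
    """Sum the per-shard top lists, then keep the overall top `size` terms."""
    total = sum(len(docs) for docs in shards)
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, size):
            merged[term] += count
    top = merged.most_common(size)
    # Assumption: "other" is whatever the returned terms do not account for.
    other = total - sum(count for _, count in top)
    return {"total": total, "other": other, "terms": top}


# Hypothetical layout: 14 phoenix and 2 scottsdale documents over 5 shards.
# The last shard holds both scottsdale documents and a single phoenix one.
shards = [
    ["phoenix"] * 4,
    ["phoenix"] * 3,
    ["phoenix"] * 3,
    ["phoenix"] * 3,
    ["phoenix", "scottsdale", "scottsdale"],
]

print(merged_facet(shards, 2))  # phoenix 14, scottsdale 2, other 0
print(merged_facet(shards, 1))  # phoenix 13, other 3
```

With size 1, the last shard reports only scottsdale (count 2), so its one phoenix document never reaches the merge step. That would account for the count of 13 and "other" of 3 in the question, while size 2 lets every shard report both terms and the totals come out exact.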


(John) #3

Yes, this is exactly the same issue.
Thanks for pointing me to that.

On Oct 28, 12:36 am, Karussell tableyourt...@googlemail.com wrote:

Do you have more than one shard? Have a look into this discussion:

https://github.com/elasticsearch/elasticsearch/issues/1305

Peter

