Terms facet with multi-fields and count

Hi,

Is this a bug, or is my understanding of how it's supposed to work wrong?

If I have this query:

{
  "from" : 0,
  "size" : 200,
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : { }
      }
    }
  },
  "facets" : {
    "specietypes" : {
      "terms" : {
        "fields" : [ "data_person_type", "data_person_gender", "data_person_drink" ],
        "size" : 3,
        "order" : "count"
      }
    }
  }
}

I would reckon that the result set should contain the 3 terms with the
largest counts across the 3 fields. Using ES v0.19.10 (embedded client),
this test fails when the facet "size" is 3:

assertEquals( 4L, (long) results.get( "male" ) );
assertEquals( 3L, (long) results.get( "human" ) );
assertEquals( 3L, (long) results.get( "beer" ) );

Instead of "human" with 3 hits, "robot" with 2 hits is returned in the
set of 3. With a larger "size" the test passes, so the data set is OK.

--
Runar Myklebust
Enonic AS
Senior Developer

http://enonic.com/download

--

You might be hitting this issue:
https://github.com/elasticsearch/elasticsearch/issues/1832
As a workaround, try increasing "size" in the facet from 3 to a larger
number and see if it improves the counts for the top 3 terms.

On Wednesday, November 14, 2012 2:13:55 AM UTC-5, Runar Myklebust wrote:

--

Indeed, when I change to 1 shard, it works as expected. This happened
during an integration test where only 6 documents were indexed, and with 4
shards I guess the effect would be very visible.
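
To illustrate why the shard count matters, here is a small sketch in plain
Java. It is a toy model, not Elasticsearch code: it assumes each shard
reports only its own top "size" terms and that the partial counts are then
summed. The documents, the shard layout, the class and method names, and the
alphabetical tie-break are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a sharded terms facet; hypothetical, not Elasticsearch code.
final class ShardedFacetDemo {

    // Hypothetical data: 6 documents (3 terms each) spread over 4 shards.
    static final List<List<List<String>>> SHARDS = Arrays.asList(
        Arrays.asList(
            Arrays.asList("human", "male", "beer"),
            Arrays.asList("robot", "male", "beer"),
            Arrays.asList("robot", "male", "beer")),
        Arrays.asList(Arrays.asList("human", "male", "wine")),
        Arrays.asList(Arrays.asList("human", "female", "tea")),
        Arrays.asList(Arrays.asList("alien", "other", "water")));

    /** Top `size` terms by count; ties are broken by term (an arbitrary choice). */
    static Map<String, Long> top(Map<String, Long> counts, int size) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(size, entries.size()))) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    /** Each shard reports only its own top `size` terms; the partial counts are summed. */
    static Map<String, Long> facet(List<List<List<String>>> shards, int size) {
        Map<String, Long> merged = new HashMap<>();
        for (List<List<String>> shard : shards) {
            Map<String, Long> local = new HashMap<>();
            for (List<String> doc : shard) {
                for (String term : doc) {
                    local.merge(term, 1L, Long::sum);
                }
            }
            // Terms outside the shard-local top `size` are dropped here,
            // so their counts never reach the merge step.
            for (Map.Entry<String, Long> e : top(local, size).entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return top(merged, size);
    }
}
```

With this made-up data, facet(SHARDS, 3) reports "human" with a count of 2
instead of 3, because the first shard drops it from its local top 3. Asking
for more terms and trimming afterwards, top(facet(SHARDS, 10), 3), gives
male 4, beer 3, human 3. With a single shard nothing is dropped before the
counts are summed, which matches what I see.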

Thanks!

On Wed, Nov 14, 2012 at 3:09 PM, Igor Motov imotov@gmail.com wrote:

--
Runar Myklebust
Enonic AS
Senior Developer

http://enonic.com/download

--