Having problems with hit count from OR filter

I have the following aggregation, which shows the document count for each
value of a particular field.

http://localhost:8200/index1/collection1/_search?search_type=count
{
  "aggs": {
    "effects": {
      "terms": {
        "field": "type"
      }
    }
  }
}

Output is
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 133490,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "effects": {
      "buckets": [
        { "key": "snp", "doc_count": 112918 },
        { "key": "indel", "doc_count": 15725 },
        { "key": "mnp", "doc_count": 3751 },
        { "key": "mixed", "doc_count": 1096 }
      ]
    }
  }
}

When I add up the individual counts, the total comes to 133490, which is
the total number of docs in the collection.
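
The tally can be double-checked with a few lines of Python. This is only a sketch: the bucket counts and the hits total are copied from the response above, and the variable names are made up for illustration.

```python
# Bucket counts copied from the "effects" terms aggregation above.
buckets = {"snp": 112918, "indel": 15725, "mnp": 3751, "mixed": 1096}

# "hits.total" from the same response.
hits_total = 133490

# Sum the per-value counts and compare against the overall hit count.
bucket_sum = sum(buckets.values())
print(bucket_sum, bucket_sum == hits_total)
```

If the two numbers agree, every document has exactly one of the four listed values in the type field (assuming the field is single-valued).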

But when I run the following query, I don't get the expected result count
(I took all the possible values returned above and turned them into an
OR query):

{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "query": {
              "filtered": {
                "filter": {
                  "or": {
                    "filters": [
                      { "query": { "match": { "type": "SNP" } } },
                      { "query": { "match": { "type": "INS" } } },
                      { "query": { "match": { "type": "DEL" } } },
                      { "query": { "match": { "type": "COMPLEX" } } },
                      { "query": { "match": { "type": "MNP" } } },
                      { "query": { "match": { "type": "MIXED" } } }
                    ]
                  }
                }
              }
            }
          }
        ]
      }
    }
  }
}

Output :
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 117765,
    "max_score": 1,
    "hits": [
      .....
    ]
  }
}

As you can see, the hit count doesn't match the number of documents.
When I convert the above query from match to a "terms"-based one, I get the
exact count:
{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "query": {
              "filtered": {
                "filter": {
                  "and": [
                    {
                      "query": {
                        "terms": {
                          "type": ["snp", "mixed", "indel", "mnp"]
                        }
                      }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
  }
}
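
A quick arithmetic check on the posted numbers can help narrow down which documents the OR query is not matching. The sketch below is only an illustration: the counts are copied from the two responses in this post, and subsets_summing_to is a hypothetical helper. It looks for a combination of aggregation buckets whose counts add up to the OR query's hit total.

```python
from itertools import combinations

# Counts copied from the terms aggregation response in this post.
buckets = {"snp": 112918, "indel": 15725, "mnp": 3751, "mixed": 1096}

# "hits.total" returned by the OR-of-match query.
or_hits = 117765


def subsets_summing_to(counts, target):
    """Return every set of bucket keys whose doc counts add up to target."""
    keys = sorted(counts)
    return [
        set(combo)
        for size in range(len(keys) + 1)
        for combo in combinations(keys, size)
        if sum(counts[k] for k in combo) == target
    ]


matches = subsets_summing_to(buckets, or_hits)
# Buckets not covered by the (first) matching combination, if there is one.
missing = set(buckets) - matches[0] if matches else None
print(matches, missing)
```

With these numbers the only combination that fits is snp + mnp + mixed, which suggests (but does not prove) that the indel documents are the ones left out. That would be consistent with the OR filter listing INS, DEL and COMPLEX, none of which appear as keys in the aggregation, while having no clause for indel.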

Is this an issue with the OR query?

Also, is there a suitable alternative with the match query that would let
me express the above query more simply, like this:
{
  "query": {
    "match": { "type": [ "snp", "mixed", "indel", "mnp" ] }
  }
}
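
One possible way to get close to this, sketched in Python: the match query analyzes its input string and, by default, combines the resulting terms with OR, so a single space-separated string can stand in for the list. This assumes the type field is analyzed with an analyzer that splits on whitespace, which the thread does not confirm; match_any is a hypothetical helper, not part of any client library.

```python
import json


def match_any(field, values):
    """Build a match query body that ORs the given values.

    Assumption: the field is analyzed, so the space-joined string is split
    back into one term per value; the match operator defaults to "or".
    """
    return {"query": {"match": {field: " ".join(values)}}}


body = match_any("type", ["snp", "mixed", "indel", "mnp"])
print(json.dumps(body, indent=2))
```

For a not_analyzed field this would not work, since the whole string would be treated as a single term; a terms query like the one earlier in this post is the safer choice there.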

Any help is appreciated.
Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b466f820-d5cc-4a3b-a77a-79fe5aaa8ada%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi Lenin,

This looks like a bug indeed... Did you manage to nail down this issue?
Could you run the same terms aggregation on the "or" query to see the
distribution of terms?
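
A rough sketch of what such a request body could look like, built in Python so the clauses do not have to be typed by hand. The "or" filter is taken from the original post with the redundant outer "and" wrapper dropped for brevity (an assumption that it makes no difference), and the aggregation is the same "effects" terms aggregation on the type field.

```python
import json

# Values used in the "or" filter of the original post.
values = ["SNP", "INS", "DEL", "COMPLEX", "MNP", "MIXED"]

# One match clause per value, combined with the "or" filter.
or_filter = {
    "or": {"filters": [{"query": {"match": {"type": v}}} for v in values]}
}

# Same query, plus the terms aggregation so the response shows which
# values the matched documents actually carry.
body = {
    "query": {"filtered": {"filter": or_filter}},
    "aggs": {"effects": {"terms": {"field": "type"}}},
}
print(json.dumps(body, indent=2))
```

Comparing the buckets returned for this request with the buckets from the unfiltered aggregation should show which values are missing from the "or" results.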

On Fri, Oct 24, 2014 at 4:05 AM, Lenin lsubramanian@maverixbio.com wrote:


--
Adrien Grand


Hi Adrien,

Thanks for getting back. I was indeed able to fix the issue; it was a data
problem on my end.
But I ran into another issue with the OR filter while figuring out the
one above.

I have posted it on GitHub.

I also have test data to reproduce it. Please let me know if you
need anything more.

Thanks.
-Lenin

On Monday, October 27, 2014 10:32:20 AM UTC-7, Adrien Grand wrote:
