Possible Bug pertaining to query engine

Hi,
This is probably a bug: the following simple boolean query returns no results. Yet if I remove either of the two expressions, it does return results, and I can clearly see that there are documents that satisfy both expressions. So why does the query return nothing when the two are combined?

{
  "from": 0,
  "size": 100,
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase_prefix": {
            "appId": {
              "query": "15"
            }
          }
        },
        {
          "match_phrase": {
            "claims": {
              "query": "polar code"
            }
          }
        }
      ]
    }
  }
}

It would be easier to help if you could show a document that you expect to match but that doesn't.

I'll try to post some fragments but it should be just as easy to create a few test documents with 2 fields to test the query. 10 minutes max.

True. As it takes only 10 minutes, you should provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.


Thanks. I'll work on that. In the meantime, my post serves as a heads-up for the intrepid reader to investigate, should it prove true.

Here are two screenshots from Kibana showing the problem. I wrote a test script to re-create it but haven't yet found a combination of test documents that reproduces it; the problem certainly exists in our index, though.

[Screenshots: boolean-bug4, boolean-bug3, boolean-bug2]

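# Start from a clean index.
curl -XDELETE 'http://localhost:9200/bugtest'
curl -XPUT 'http://localhost:9200/bugtest'
# Index two test documents (the Content-Type header is required from Elasticsearch 6.0 on).
curl -XPUT 'http://localhost:9200/bugtest/doc/1' -H 'Content-Type: application/json' -d '{ "appId":"15123456", "claims":"The dog jumped over the cat." }'
curl -XPUT 'http://localhost:9200/bugtest/doc/2' -H 'Content-Type: application/json' -d '{ "appId":"15128490", "claims":"The fox jumped over the rabbit." }'
# Wait for the next refresh so the documents are searchable.
sleep 2
# The URL is quoted so the shell does not try to glob the "?".
curl -XPOST 'http://localhost:9200/bugtest/_search?pretty' -H 'Content-Type: application/json' -d '{
  "from": 0,
  "size": 100,
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase_prefix": {
            "appId": {
              "query": "15"
            }
          }
        },
        {
          "match_phrase": {
            "claims": {
              "query": "fox jumped"
            }
          }
        }
      ]
    }
  }
}'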
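
One way to narrow this down without sharing data is to count hits for each clause alone and then for the combination. This is only a minimal sketch, assuming the local node and the bugtest index from the script above; must_body and count_hits are hypothetical helper names, while _count is the standard count endpoint.

```shell
# Assumed defaults; override ES and INDEX for a real cluster.
ES="${ES:-http://localhost:9200}"
INDEX="${INDEX:-bugtest}"

# The two clauses from the query above, kept separate so each can be tested alone.
PREFIX_CLAUSE='{ "match_phrase_prefix": { "appId": { "query": "15" } } }'
PHRASE_CLAUSE='{ "match_phrase": { "claims": { "query": "fox jumped" } } }'

# Wrap one or more comma-separated clauses in a bool/must query body.
must_body() {
  printf '{ "query": { "bool": { "must": [ %s ] } } }' "$1"
}

# Send a query body to the _count endpoint and print the raw response.
count_hits() {
  curl -s -XPOST "$ES/$INDEX/_count?pretty" \
    -H 'Content-Type: application/json' \
    -d "$1"
}

# Usage against a running node:
#   count_hits "$(must_body "$PREFIX_CLAUSE")"
#   count_hits "$(must_body "$PHRASE_CLAUSE")"
#   count_hits "$(must_body "$PREFIX_CLAUSE, $PHRASE_CLAUSE")"
```

If the two single-clause counts are non-zero and the combined count is zero, the next step would be to check whether any single document ID appears in both single-clause result sets.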

If we can't reproduce it, then it's probably a different problem than the one you are describing.

I can reproduce it just fine with my data, but I can't share that. However, I can say with 100% certainty that the behavior is incorrect given the data and query I posted (see the screenshots). I will try to reproduce it in a script, but beyond that the Elastic dev team should take note of it and do their own due diligence, for example by updating a unit test to cover this particular query combination.

We would love to. We just need some real data for that.

I suppose you already have real data that you test Elasticsearch's functionality with?

For example, if I were building a search engine, I would need to ensure that it:
A) Works at scale
B) Performs well with millions of records
C) Can faithfully execute the search algebra, etc.

A quick way to check whether this is a bug in your product is to run a similar query against your own data indexes. If it works there, that would be good for me and the community to know, so we can focus on where the problem might be.

As I noted before, just the data alone does not reveal the issue. But the issue is there nonetheless.

We do have plenty of tests.

Or it's something else, like a bad mapping that does not produce exactly what you believe it does.
That's why a real example of your problem would be a great help in checking whether it's a bug (in which case it will be added to the test suite) or a misuse (in which case we would be happy to help you fix it).
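
To check the mapping theory, it can help to look at what the index actually holds. This is a rough sketch, assuming a local node and the bugtest index name from the script earlier in the thread (substitute your real index); analyze_body, show_mapping and analyze_field are hypothetical helper names, while _mapping and _analyze are the standard endpoints.

```shell
# Assumed defaults; override ES and INDEX for a real cluster.
ES="${ES:-http://localhost:9200}"
INDEX="${INDEX:-bugtest}"

# Build the JSON body for _analyze: which tokens does <field> produce for <text>?
analyze_body() {
  printf '{ "field": "%s", "text": "%s" }' "$1" "$2"
}

# Print the index mapping so the actual types of appId and claims are visible.
show_mapping() {
  curl -s -XGET "$ES/$INDEX/_mapping?pretty"
}

# Show the tokens that the field's analyzer emits for a sample value.
analyze_field() {
  curl -s -XPOST "$ES/$INDEX/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}

# Usage against a running node:
#   show_mapping
#   analyze_field appId "15123456"
#   analyze_field claims "The fox jumped over the rabbit."
```

Comparing the tokens returned for each field with the terms in the query should show whether the mapping matches what you expect.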

