Hi,
I had a query that searched on certain fields and was scored using a custom
script. I then needed to take account of missing data and found that I'd
need to use a filter for this. The resulting query (simplified) is:
{
  "query": {
    "custom_filters_score": {
      "query": {
        "match_all": {}
      },
      "filters": [
        {
          "script": "((doc['id'].value == 'hi') ? 2 : 0) + 20",
          "filter": {
            "bool": {
              "should": [
                {
                  "term": {
                    "id": "hello"
                  }
                },
                {
                  "term": {
                    "id": "hi"
                  }
                },
                {
                  "missing": {
                    "field": "jurisdiction",
                    "existence": true,
                    "null_value": true
                  }
                }
              ]
            }
          }
        }
      ]
    }
  },
  "size": 30
}
My problem is that the 'match_all' query returns every document, whereas
I only want documents matched by my filter to be returned. By doctoring
the scoring script (adding 20) I can ensure that the filtered documents are
at the top of the list, but I'm worried about performance.
My original query returned only documents matching my terms, but I could
not get it to handle missing data. The original query is:
{
  "query": {
    "custom_score": {
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "id": "hi"
              }
            },
            {
              "term": {
                "id": "hello"
              }
            }
          ]
        }
      },
      "script": "((doc['id'].value == 'hi') ? 2 : 0)"
    }
  }
}
Please note that this is a simplified query; I need the script rather than
a boost.
On Thursday, 11 October 2012 09:10:36 UTC+1, Clinton Gormley wrote:
Hi Amy
I'm unclear from your examples exactly what your query should do. Could
you explain in English what you want to achieve?
Hi,
To clarify, in this example I want to return all documents where the field
"id" is either "hi" or "hello" or is missing altogether.
In the query I provided (without the filter), documents without the field
"id" were not returned, which was not what I wanted.
In the query that had the filter, all documents were returned, which was
also not what I wanted, but I was able to bring the documents I wanted to
the top of the results by adding 20 to the score in the filter. This is not
ideal, and I'm worried about performance, even if I were to change the size
returned to 1.
Does that explain my predicament?
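The rule described above can be written out in plain Python to make the intended result concrete. This is only an illustrative sketch: the sample documents, the "name" field, and the helper names `matches` and `score` are hypothetical, and an explicit null is assumed to count as missing (as `"null_value": true` in the filter suggests).

```python
# Hypothetical sample documents; only the "id" field matters for matching.
docs = [
    {"name": "a", "id": "hi"},
    {"name": "b", "id": "hello"},
    {"name": "c"},                # "id" missing altogether
    {"name": "d", "id": None},    # explicit null, assumed to count as missing
    {"name": "e", "id": "other"}, # should NOT be returned
]

WANTED_IDS = {"hi", "hello"}


def matches(doc):
    """True if "id" is one of the wanted values, or is missing/null."""
    value = doc.get("id")
    return value is None or value in WANTED_IDS


def score(doc):
    """Mirrors the simplified script: 2 when id is 'hi', otherwise 0."""
    return 2 if doc.get("id") == "hi" else 0


# Keep only matching documents, then order them by score (highest first).
hits = sorted((d for d in docs if matches(d)), key=score, reverse=True)
hit_names = [d["name"] for d in hits]
print(hit_names)
```

Document "e" is dropped entirely rather than pushed down the list, which is the behaviour the 'match_all' version cannot give.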
On Thursday, October 11, 2012 2:10:22 PM UTC+1, Clinton Gormley wrote:
Hi Amy
Yes - much clearer
The exact structure of the query depends on what else you want to do,
but based on the description above, you could do this:
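The query itself is not reproduced at this point in the thread. As a hedged sketch only (not necessarily the query from this reply), one shape that fits the description is a `custom_score` query wrapped around a `constant_score` query, so that only documents passing the filter are scored at all. It assumes the query DSL of the Elasticsearch versions current at the time (`custom_score`, `constant_score`, and the `or`, `terms` and `missing` filters); the helper name `build_query` and its parameters are hypothetical, and the script is the simplified one from the original post.

```python
import json


def build_query(wanted_ids, field="id", size=30):
    """Build a custom_score query that only scores documents passing the filter."""
    # A document passes if the field holds one of the wanted values
    # OR the field is missing.
    doc_filter = {
        "or": [
            {"terms": {field: list(wanted_ids)}},
            {"missing": {"field": field}},
        ]
    }
    # Simplified scoring script taken from the original post.
    script = "((doc['%s'].value == 'hi') ? 2 : 0)" % field
    return {
        "query": {
            "custom_score": {
                # constant_score turns the filter into a query, so documents
                # outside the filter are never returned (no match_all needed).
                "query": {"constant_score": {"filter": doc_filter}},
                "script": script,
            }
        },
        "size": size,
    }


query = build_query(["hi", "hello"])
print(json.dumps(query, indent=2))
```

Because the filter decides which documents are returned, the script no longer needs the extra 20 to push wanted documents to the top.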
Thank you Clint, that was exactly what I was looking for!