Strange inconsistency in searching using booleans

Great stuff. I rewrote most of my requests to use only filters. I need to
sort on a field, but mostly never need to leverage scoring.

Thanks a lot!

On Monday, May 7, 2012 4:30:22 AM UTC-4, Clinton Gormley wrote:

Hi Jim

On Sat, 2012-05-05 at 11:06 -0700, James Cook wrote:

Thanks Clinton, I'm writing some basic unit tests to get a better
understanding of term vs. field. Is there any general purpose guidance
you can add to when one would use a field search/filter versus a term
search/filter. Is a term search meant to be run against only
properties that are not analyzed? I am just not clear on what the
difference is between the two.

First point: it's a field QUERY, not filter.

A 'term' filter or query does exact matching only. There is no analysis.

A 'text' query analyzes the search keywords, using the search_analyzer
which is defined for a particular field.

A 'field' query is similar to a text query, except it also takes the
Lucene Query Parser Syntax into account (see "Apache Lucene - Query
Parser Syntax").

If you have a field which is marked as 'not_analyzed' or analyzer:
'keyword' then the 'text' or 'field' queries will take that into account
and will be the equivalent of a term query. However, the 'field' query's
lucene syntax may interfere with things, so best to only use that where
you really want it.
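
To make the term vs. text distinction concrete, here is a toy sketch in Python. It is not Elasticsearch code: the `analyze` function is a hypothetical stand-in for a field's analyzer (assumed here to lowercase and split on whitespace), and the two lookup helpers only imitate how a 'term' clause and a 'text' clause treat the search value.

```python
# Toy model only, NOT Elasticsearch code. It imitates an analyzed field whose
# analyzer lowercases and splits on whitespace (an assumption for illustration).

def analyze(text):
    """Hypothetical stand-in for a field's analyzer."""
    return text.lower().split()

def build_index(docs):
    """Map each indexed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_lookup(index, value):
    """Like a 'term' clause: the value is looked up exactly as given."""
    return index.get(value, set())

def text_lookup(index, value):
    """Like a 'text' clause: the value is analyzed first, then looked up."""
    hits = set()
    for term in analyze(value):
        hits |= index.get(term, set())
    return hits

docs = {1: "Quick Brown Fox", 2: "lazy dog"}
index = build_index(docs)

print(term_lookup(index, "Quick"))  # empty: only the lowercased "quick" was indexed
print(text_lookup(index, "Quick"))  # finds doc 1, because the value is analyzed too
```

On a not_analyzed field the indexed term is the original value unchanged, which is why the two kinds of lookup end up equivalent there.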

So let's say that you decide to go with 'term' clauses. Next question is:
should they be filters or queries? To answer that, ask yourself:

    should the results of this clause be boolean (ie either include 
    or exclude results)  --> filter 
    
    or 
    
    should they affect the scoring (the more terms that match, the 
    higher the relevance) --> query 
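
As a rough sketch of that choice, the Python below builds both request bodies as plain dicts. The field names 'active' and 'idea' come from this thread; the sort field 'name' and the helper names are hypothetical, and the shapes assume the 0.19-era 'filtered' query and 'bool' filter.

```python
import json

def as_filter(conditions, sort_field):
    """Yes/no matching: term filters inside a 'filtered' query, then a sort.

    No scoring is needed, so the order comes from the sort field.
    """
    term_filters = [{"term": {f: v}} for f, v in sorted(conditions.items())]
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": {"must": term_filters}},
            }
        },
        "sort": [{sort_field: "asc"}],
    }

def as_query(conditions):
    """Scored matching: the more 'should' terms match, the higher the relevance."""
    term_queries = [{"term": {f: v}} for f, v in sorted(conditions.items())]
    return {"query": {"bool": {"should": term_queries}}}

conditions = {"active": True, "idea": False}
# 'name' is a made-up sort field for illustration.
print(json.dumps(as_filter(conditions, "name"), indent=2))
print(json.dumps(as_query(conditions), indent=2))
```

The first body only includes or excludes documents and leaves the order to the sort; the second lets each matching term contribute to the score.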

hth

clint

A term filter matches only exact terms, so no analysis is done.

A field QUERY (note the filter do

-- jim

On Monday, April 30, 2012 1:59:12 AM UTC-4, Clinton Gormley wrote:
Hiya James

    This works for me too. Are you sure you're not using the 'idea' field in
    a different type with a different mapping?
    
    On Sun, 2012-04-29 at 20:30 -0400, James Cook wrote: 
    > Unfortunately, it is a much more complicated structure than I have
    > represented here and there are lots of different types. Perhaps I can
    > get some advice regarding a couple aspects of the general problem?
    >      1. If I have properties in other types also called 'idea' can I
    >         expect strange behavior if they are mapped in a manner
    >         different than the property in this type? (Are property
    >         mappings specific to 'type'?)
    
    Right - you can get weird results.  Property mappings can interfere with
    each other.  I'm unsure of the exact mechanism, but if you name the
    field unequivocally, then it should do the right thing, eg including the
    type name in the field name 'mytype.idea'
    
    Otherwise, it can resolve the name 'idea' to the wrong mapping.
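
A small sketch of the type-qualified naming suggested above, in Python. The helper names are hypothetical; 'ventures' and 'idea' are the type and property from the original post, and the 'type.field' form follows the 'mytype.idea' example.

```python
def qualified(type_name, field_name):
    """Prefix the field with its type so the name is unambiguous."""
    return f"{type_name}.{field_name}"

def term_filter(type_name, field_name, value):
    """Build a 'term' filter against the type-qualified field name."""
    return {"term": {qualified(type_name, field_name): value}}

# Targets the 'idea' property of the 'ventures' type specifically.
print(term_filter("ventures", "idea", False))
```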
    
    >      2. When querying or filtering by a boolean value, should one use
    >         a field or a term?
    
    You should be able to use either. 'field' sees that the property is
    mapped as boolean, and does the right thing.
    
    >      3. I suppose one should almost never do a query for a boolean
    >         value as it most likely would not impact score unless it was
    >         in a 'should' clause. Should boolean conditions most often be
    >         included in a filter and very rarely included in a query?
    
    As you say, it depends whether you want them to affect the score or not.
    That's the general rule.  Sometimes, depending on your query and how
    often you use the exact combination of values in that query, it *may* be
    more efficient to use a bool query rather than several filters.  But
    whether it is better or not depends very much on what you're doing, and
    you should test how each performs before you make a decision on that.
    
    Also, see the 'bool' execution mode of the 'terms' filter, which (again,
    depending on your exact query) may be more efficient still.
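
For reference, a hedged sketch of what a 'terms' filter with the 'bool' execution mode might look like when built from Python. The field name 'category' and its values are invented for illustration, and the helper names are hypothetical; whether this is faster than the alternatives is something to measure on your own data.

```python
import json

def terms_filter(field, values, execution="bool"):
    """Build a 'terms' filter; 'execution' selects how the terms are combined."""
    return {"terms": {field: list(values), "execution": execution}}

def filtered_search(filter_clause):
    """Wrap a filter in a 'filtered' query that matches all documents."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        }
    }

# 'category' and its values are made-up example data.
body = filtered_search(terms_filter("category", ["seed", "growth"]))
print(json.dumps(body, indent=2))
```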

    clint 
      
    > 
    > On Sun, Apr 29, 2012 at 12:47 PM, Shay Banon <kimchy@gmail.com> wrote:
    >         Can you post a full recreation? This seems to
    >         work: https://gist.github.com/2551823.
    >
    >         On Fri, Apr 27, 2012 at 4:20 PM, James Cook <jcook@pykl.com>
    >         wrote:
    >                 I just recently upgraded from 0.18 to 0.19.2 and
    >                 noticed this new behavior creep into my codebase,
    >                 although I would have to consider it a bug.
    >
    >
    >                 I have a mapping file with quite a few different
    >                 properties, but here is a subset containing some
    >                 boolean values:
    >                 
    >                 
    >                 $ curl -XGET 'http://localhost:9311/nep/ventures/_mapping?pretty=true'
    >                 { 
    >                   "ventures" : { 
    >                     "_id" : { 
    >                       "index" : "not_analyzed" 
    >                     }, 
    >                     "properties" : { 
    >                       "active" : { 
    >                         "type" : "boolean", 
    >                         "index" : "not_analyzed" 
    >                       }, 
    >                       "idea" : { 
    >                         "type" : "boolean", 
    >                         "index" : "not_analyzed" 
    >                       }, 
    >                       "serviceProvider" : { 
    >                         "type" : "boolean", 
    >                         "index" : "not_analyzed" 
    >                       } 
    >                     } 
    >                   } 
    >                 } 
    >                 
    >                 
    >                 After inserting a few dozen sample records, I perform
    >                 some queries:
    >                 { "query" : {
    >                       "bool" : {
    >                           "must" : [
    >                               { "field" : { "active" : true } },
    >                               { "field" : { "idea" : false } },
    >                               { "field" : { "serviceProvider" : true } }
    >                             ]
    >                         }
    >                     }
    >                 }
    >                 
    >                 
    >                 
    >                 Any time I specify { "field" : { "idea" : false } } I
    >                 get 0 results, but when I change this to a term
    >                 search, I see my expected results. None of the other
    >                 boolean fields require this stipulation and work the
    >                 same way whether I specify term or field. In 0.18, a
    >                 term search was not required for the 'idea' property.
    >
    >
    >                 Is this expected behavior, or is something else
    >                 interfering with my mapping/queries?
    >         
    >         
    > 
    >