Hi Jim
On Sat, 2012-05-05 at 11:06 -0700, James Cook wrote:
Thanks Clinton, I'm writing some basic unit tests to get a better
understanding of term vs. field. Is there any general purpose guidance
you can add as to when one would use a field search/filter versus a term
search/filter? Is a term search meant to be run against only
properties that are not analyzed? I am just not clear on what the
difference is between the two.
First point: it's a field QUERY, not filter.
A 'term' filter or query does exact matching only. There is no analysis.
A 'text' query analyzes the search keywords, using the search_analyzer
which is defined for a particular field.
A 'field' query is similar to a text query, except it also takes the
Lucene Query Parser Syntax into account
http://lucene.apache.org/core/3_6_0/queryparsersyntax.html
If you have a field which is marked as 'not_analyzed' or analyzer:
'keyword' then the 'text' or 'field' queries will take that into account
and will be the equivalent of a term query. However, the 'field' query's
lucene syntax may interfere with things, so best to only use that where
you really want it.
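To make the three clause types concrete, here is a rough sketch of the request bodies, built as plain Python dicts. The field name 'title' and the search strings are made up for illustration; only the 'term', 'text' and 'field' clause names come from the discussion above.

```python
import json

# Hypothetical field name and search string, for illustration only.
FIELD = "title"
KEYWORDS = "Quick Brown"

# 'term': the value is matched exactly as given, with no analysis.
term_query = {"term": {FIELD: KEYWORDS}}

# 'text': the value is analyzed with the field's search analyzer first.
text_query = {"text": {FIELD: KEYWORDS}}

# 'field': like 'text', but the value is also parsed as Lucene query
# syntax, so characters such as + and - act as operators.
field_query = {"field": {FIELD: "+quick -brown"}}

for clause in (term_query, text_query, field_query):
    print(json.dumps({"query": clause}))
```

Each printed line is a body you could send to _search. The three differ only in the clause name, which is why it is easy to mix them up.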
So let's say that you decide to go with 'term' clauses. Next question is:
should they be filters or queries? To answer that, ask yourself:
should the results of this clause be boolean (i.e. either include
or exclude results) --> filter
or
should they affect the scoring (the more terms that match, the
higher the relevance) --> query
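As a rough sketch of the two choices, the helpers below build one body of each kind. The helper names and the 'tags' field are hypothetical; 'active' is the boolean property from the mapping quoted further down. The filter version assumes the 'filtered' query wrapper with a match_all query inside.

```python
import json


def term_filter_search(field, value):
    """Term clause as a filter: documents are included or excluded,
    and the clause does not contribute to the score."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {field: value}},
            }
        }
    }


def scored_term_search(field, values):
    """Term clauses as 'should' queries: the more of them a document
    matches, the higher it scores."""
    return {
        "query": {
            "bool": {"should": [{"term": {field: v}} for v in values]}
        }
    }


print(json.dumps(term_filter_search("active", True)))
print(json.dumps(scored_term_search("tags", ["search", "lucene"])))
```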
hth
clint
A term filter matches only exact terms, so no analysis is done.
A field QUERY (note the filter do
-- jim
On Monday, April 30, 2012 1:59:12 AM UTC-4, Clinton Gormley wrote:
Hiya James
This works for me too. Are you sure you're not using the 'idea' field in
a different type with a different mapping?
On Sun, 2012-04-29 at 20:30 -0400, James Cook wrote:
> Unfortunately, it is a much more complicated structure than I have
> represented here and there are lots of different types. Perhaps I can
> get some advice regarding a couple aspects of the general problem?
> 1. If I have properties in other types also called 'idea' can I
>    expect strange behavior if they are mapped in a manner
>    different than the property in this type? (Are property
>    mappings specific to 'type'?)
Right - you can get weird results. Property mappings can interfere with
each other. I'm unsure of the exact mechanism, but if you name the
field unequivocally, then it should do the right thing, e.g. by including
the type name in the field name: 'mytype.idea'.
Otherwise, it can resolve the name 'idea' to the wrong mapping.
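For example, a minimal sketch of a 'field' query that uses the type-qualified name. The helper names are hypothetical; 'ventures' and 'idea' are the type and property from the original report.

```python
import json


def qualified_name(type_name, field):
    """Prefix the field with its type name, e.g. 'ventures.idea'."""
    return "%s.%s" % (type_name, field)


def field_query(type_name, field, value):
    """Build a 'field' query against the type-qualified field name, so
    the name cannot resolve to another type's mapping."""
    return {"field": {qualified_name(type_name, field): value}}


print(json.dumps({"query": field_query("ventures", "idea", False)}))
```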
> 2. When querying or filtering by a boolean value, should one use
>    a field or a term?
You should be able to use either. 'field' sees that the property is
mapped as boolean, and does the right thing.
> 3. I suppose one should almost never do a query for a boolean
>    value as it most likely would not impact score unless it was
>    in a 'should' clause. Should boolean conditions most often be
>    included in a filter and very rarely included in a query?
As you say, it depends whether you want them to affect the score or not.
That's the general rule. Sometimes, depending on your query and how
often you use the exact combination of values in that query, it *may* be
more efficient to use a bool query rather than several filters. But
whether it is better or not depends very much on what you're doing, and
you should test how each performs before you make a decision on that.
Also, see the 'bool' execution mode of the 'terms' filter, which (again,
depending on your exact query) may be more efficient still:
http://www.elasticsearch.org/guide/reference/query-dsl/terms-filter.html
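A rough sketch of what that looks like as a request body. The helper name, the 'tags' field and its values are made up for illustration; check the page above for the execution modes your version supports.

```python
import json


def terms_filter(field, values, execution="bool"):
    """Build a 'terms' filter, setting the execution mode explicitly."""
    return {"terms": {field: list(values), "execution": execution}}


body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": terms_filter("tags", ["search", "lucene"]),
        }
    }
}

print(json.dumps(body))
```

As with the other options, time this against your own data before settling on it.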
clint
>
> On Sun, Apr 29, 2012 at 12:47 PM, Shay Banon <kimchy@gmail.com> wrote:
> Can you post a full recreation? This seems to work:
> https://gist.github.com/2551823
>
> On Fri, Apr 27, 2012 at 4:20 PM, James Cook <jcook@pykl.com> wrote:
> I just recently upgraded from 0.18 to 0.19.2 and noticed this new
> behavior creep into my codebase, although I would have to consider
> it a bug.
>
> I have a mapping file with quite a few different properties, but
> here is a subset containing some boolean values:
>
>
> $ curl -XGET 'http://localhost:9311/nep/ventures/_mapping?pretty=true'
> {
>   "ventures" : {
>     "_id" : {
>       "index" : "not_analyzed"
>     },
>     "properties" : {
>       "active" : {
>         "type" : "boolean",
>         "index" : "not_analyzed"
>       },
>       "idea" : {
>         "type" : "boolean",
>         "index" : "not_analyzed"
>       },
>       "serviceProvider" : {
>         "type" : "boolean",
>         "index" : "not_analyzed"
>       }
>     }
>   }
> }
>
>
> After inserting a few dozen sample records, I perform some queries:
> { "query" : {
>     "bool" : {
>       "must" : [
>         { "field" : { "active" : true } },
>         { "field" : { "idea" : false } },
>         { "field" : { "serviceProvider" : true } }
>       ]
>     }
>   }
> }
>
>
>
> Any time I specify { "field" : { "idea" : false } } I get 0 results,
> but when I change this to a term search, I see my expected results.
> None of the other boolean fields require this stipulation and work
> the same way whether I specify term or field. In 0.18, a term search
> was not required for the 'idea' property.
>
> Is this expected behavior, or is something else interfering with my
> mapping/queries?
>
>
>
>