Search across fields using different custom index analyzers


(Stephanie) #1

I have an index containing multiple types. On some of those types, some
fields use a custom analyzer. This was done to allow case-insensitive
search that can also match on special characters. However, when I execute
a query string search across all fields (without specifying a field or
default_field), every field seems to be matched as if it used the standard
analyzer. Items with special characters are matched only if I use a space
in place of the special character. Searches that specify a particular field
work fine and match on special characters.

Is there an adjustment to the mapping or query construction that I can make
to enable searches using different analyzers on different fields?

Thanks

--


(Ivan Brusic) #2

When searching across "all fields", you are actually searching a field
called _all.

http://www.elasticsearch.org/guide/reference/mapping/all-field.html

This field has its own analyzer, which defaults to the StandardAnalyzer. You
can redefine the analyzer for the _all field in your mapping.

Another option is to use the multi-field option of the query string query
and explicitly search against multiple fields.
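
A minimal sketch of both options, written as Python that only builds the request bodies. The type name "my_type", the analyzer name "custom_analyzer", and the field list are hypothetical placeholders, and the bodies assume the _all mapping accepts an "analyzer" setting and that query_string accepts a "fields" list, as described in the reference guide of that era.

```python
import json


def all_field_mapping(type_name, analyzer):
    """Mapping body that sets the analyzer used by the _all field."""
    return {type_name: {"_all": {"analyzer": analyzer}}}


def multi_field_query(text, fields):
    """query_string body that searches an explicit list of fields instead of _all."""
    return {
        "query": {
            "query_string": {
                "fields": list(fields),
                "default_operator": "AND",
                "query": text,
            }
        }
    }


# Placeholder names; substitute the real type, analyzer, and fields.
print(json.dumps(all_field_mapping("my_type", "custom_analyzer")))
print(json.dumps(multi_field_query("some text", ["name", "description"])))
```

With the explicit field list, each field should be searched with its own mapped analyzer, at the cost of listing the fields in the query.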

Cheers,

Ivan

(In reply to Stephanie, Tue, Oct 9, 2012 at 8:20 AM)

--


(Stephanie) #3

_all sounds very promising. However, while I seem to be able to
successfully update the type mapping in an index with a custom analyzer, it
doesn't seem to change the behavior of a query against the _all field. Is
there a trick to getting information about the _all field to come back with
the metadata on a mapping get? I cannot seem to find a way to verify that
I've actually updated the analyzer.
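
One possible way to check this, sketched in Python that only builds the request URLs (nothing is sent). The host, index, and type names are hypothetical, and it is an assumption that the analyze API accepts field=_all and a text parameter on this version; if it does, the returned tokens would show which analyzer is in effect for _all.

```python
from urllib.parse import quote, urlencode


def mapping_url(base, index, type_name):
    """URL for fetching the current mapping of one type."""
    return f"{base}/{quote(index)}/{quote(type_name)}/_mapping"


def analyze_url(base, index, field, text):
    """URL asking the analyze API to tokenize text as the given field would."""
    return f"{base}/{quote(index)}/_analyze?" + urlencode({"field": field, "text": text})


base = "http://localhost:9200"  # hypothetical host
print(mapping_url(base, "my_index", "my_type"))
print(analyze_url(base, "my_index", "_all", "l&g foods"))
```

If the response lists several tokens for "l&g foods" rather than the tokens the custom analyzer produces on its own, the _all field is presumably still being analyzed with the default analyzer.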

Thanks for your help

(In reply to Stephanie, Tuesday, October 9, 2012 11:20:03 AM UTC-4)

--


(Stephanie) #4

I'm having trouble making this approach work in a way that seems consistent
between the _all field and the other fields I have mapped to the custom
analyzer.

(Again, if there is a more straightforward approach to
enabling special character matching and preserving case-insensitive
matching, I would be happy to give something else a shot. I am trying to
avoid constructing a query against every field on all of our types.)

First, I should say that while I am seemingly able to update the analyzer
on the _all field successfully, I can't get anything back in the mapping
metadata indicating that the non-default analyzer has been applied. Any
suggestion there would be fantastic. I've attempted to make sure both the
_all and the name queries are using the standard search_analyzer. The
inconsistent behavior involves matching partial strings that contain
multiple words. Any clues as to what might explain the differences in
behavior would be welcome.

For instance, consider searching for an entity with name="l&g foods".

If I execute this query against the name field where the name field is
using the custom analyzer but the _all field is not, I get a matching
result.

{"query":{"query_string":{"default_field":"name","default_operator":"AND","query":"l&g\ foo*"}},"fields":["name"]}

However, if I execute the same query against the _all field in an index
where both the _all field and the name field are using the custom analyzer,
I do not get matching results.

{"query":{"query_string":{"default_field":"_all","default_operator":"AND","query":"l&g\ foo*"}},"fields":["name"]}

Instead, I have to run the query with an unescaped space...

{"query":{"query_string":{"default_field":"_all","default_operator":"AND","query":"l&g foo*"}},"fields":["name"]}

When querying the index where _all is set to the custom analyzer, I cannot
seem to construct a query that matches the single term containing the
special character while still using a trailing wildcard...

{"query":{"query_string":{"default_field":"_all","default_operator":"AND","query":"l&g*"}},"fields":["name"]}

returns no results. However,

{"query":{"query_string":{"default_field":"_all","default_operator":"AND","query":"l&g"}},"fields":["name"]}

will match.

Finally, leading wildcards seem to behave differently. My _all searches
seem to disallow them (no matches), while they seem to be required in the
name field searches, as I would expect, to find words in the middle of the
field text. It's almost as if the _all field analyzer is still splitting
the field into tokens.
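
A rough Python simulation of that suspicion, not a reproduction of any real analyzer. It assumes the custom analyzer keeps the whole value as one lowercased term, uses a crude word splitter as a stand-in for a tokenizing analyzer, and relies on the fact that wildcard patterns are compared against indexed terms without being analyzed. All function names are hypothetical.

```python
import fnmatch
import re


def keyword_lowercase(text):
    """Whole value as a single lowercased term (assumed custom analyzer)."""
    return [text.lower()]


def simple_word_split(text):
    """Lowercased runs of letters and digits (crude stand-in for a tokenizing analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def wildcard_hits(pattern, terms):
    """Compare a wildcard pattern to indexed terms as-is, with no analysis of the pattern."""
    return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]


name = "l&g foods"
single = keyword_lowercase(name)   # ['l&g foods']
split = simple_word_split(name)    # ['l', 'g', 'foods']

print(wildcard_hits("l&g foo*", single))  # ['l&g foods']
print(wildcard_hits("*foods", single))    # ['l&g foods']
print(wildcard_hits("l&g*", split))       # []
print(wildcard_hits("foo*", split))       # ['foods']
```

Under these assumptions the pattern l&g* finds nothing once the value has been split into separate terms, while a single-term index needs a leading wildcard to reach a word in the middle of the text, which resembles the behavior described above.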

Thank you again.

(In reply to Ivan Brusic, Tuesday, October 9, 2012 12:13:54 PM UTC-4)

--

