Can't get snowball analyzer to work via Java

Jason_5 · August 21, 2012, 12:30am

Hey folks,

This is either a misconfiguration by me, or a misunderstanding (or both)
but I'm struggling to get the snowball analyzer to work.

When I create my index(es) I am specifying the following settings:

ImmutableSettings.settingsBuilder().loadFromSource(jsonBuilder()
    .startObject()
        .startObject("analysis")
            .startObject("analyzer")
                .startObject("custom")
                    .field("tokenizer", "standard")
                    .field("filter", new String[]{"standard", "lowercase", "snowball"})
                .endObject()
            .endObject()
            .startObject("filter")
                .startObject("snowball")
                    .field("type", "snowball")
                    .field("language", "English")
                .endObject()
            .endObject()
        .endObject()
    .endObject().string());

Then when I perform a search I am specifying the analyzer:

QueryBuilder query = queryString(queryString).analyzer("custom");

But it's not working (meaning, a search for the term "art" does not match
documents with the word "arts"). I have NOT yet manually added the
analyzer to the field definition in the mapping because I don't want to
define a specific language for the field (I don't know ahead of time what
language the content will be in).

I'm wondering if this is just the wrong approach. Do I HAVE to nominate a
specific analyzer for a single field? And if so, how would one go about
supporting multiple languages? (Multiple indexes, I guess?)

What I'm really looking for is a full working example of a "sensible"
configuration for ElasticSearch which will give me all the basic free text
search features, like stemming. Although the "a la carte" approach is
great, it would be nice if the default implementation served the most
predominant use case(s).

Thanks!

--

dadoonet · August 21, 2012, 1:51am

I think that you have indexed "arts" with the default analyzer, so the index contains the unstemmed term "arts". When you search for "art", you can't find it.
Specifying an analyzer at search time only means that your query string is analyzed before being compared with the index. So "Art" is analyzed to "art", but the indexed term is still "arts".

To make it work, you should apply your analyzer at index time. So, you need to define it on your field.
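
To illustrate the mismatch, here is a minimal sketch in plain Java with no Elasticsearch dependency. The toyStem method is a hypothetical stand-in that only strips a trailing "s"; it is not the Snowball algorithm, and the class and method names are made up for this example. It only shows why stemming the query alone does not help when the indexed terms were never stemmed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AnalysisMismatchDemo {

    // Hypothetical stand-in for a stemmer: strips one trailing "s".
    // This is NOT Snowball; it only mimics "arts" -> "art".
    static String toyStem(String token) {
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Roughly what a default analyzer does here: lowercase and split into words.
    static List<String> defaultAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Default analysis followed by the toy stemmer.
    static List<String> stemmingAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : defaultAnalyze(text)) {
            tokens.add(toyStem(t));
        }
        return tokens;
    }

    // Builds the set of indexed terms, with or without stemming at index time.
    static Set<String> index(String doc, boolean stemAtIndexTime) {
        return new HashSet<>(stemAtIndexTime ? stemmingAnalyze(doc) : defaultAnalyze(doc));
    }

    // The query is always stemmed, as in the original post.
    static boolean matches(Set<String> indexedTerms, String query) {
        return indexedTerms.containsAll(stemmingAnalyze(query));
    }

    public static void main(String[] args) {
        String doc = "Modern arts and crafts";
        // Indexed without stemming: the index holds "arts", the query is "art".
        System.out.println(matches(index(doc, false), "art")); // prints false
        // Indexed with the same stemming analyzer: both sides are "art".
        System.out.println(matches(index(doc, true), "art"));  // prints true
    }
}
```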

If you have multiple analyzers to apply to a field, I recommend using the multi-field feature.
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type.html
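
As a rough sketch of what such a mapping might look like, the plain-Java helper below assembles a multi_field mapping as a JSON string, assuming the mapping syntax of the Elasticsearch releases current at the time of this thread. The type name "doc", the field name "body", and the sub-field name "english" are hypothetical; "custom" is the analyzer name from the settings in the original post. The helper itself is only for illustration and is not part of any Elasticsearch API.

```java
class MultiFieldMappingSketch {

    // Assembles a mapping in which one field is indexed twice:
    // once with the default analyzer and once with the named analyzer.
    // All names passed in are illustrative assumptions.
    static String mappingJson(String typeName, String fieldName, String analyzerName) {
        return String.join("\n",
            "{",
            "  \"" + typeName + "\": {",
            "    \"properties\": {",
            "      \"" + fieldName + "\": {",
            "        \"type\": \"multi_field\",",
            "        \"fields\": {",
            "          \"" + fieldName + "\": { \"type\": \"string\" },",
            "          \"english\": { \"type\": \"string\", \"analyzer\": \"" + analyzerName + "\" }",
            "        }",
            "      }",
            "    }",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        // Prints the mapping JSON for a hypothetical "doc" type with a "body" field.
        System.out.println(mappingJson("doc", "body", "custom"));
    }
}
```

With a mapping along these lines, the stemmed variant would be searched through the sub-field (here, body.english), and further sub-fields could be added for other languages.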

HTH

David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


--

Jason_5 · August 30, 2012, 3:16am

Hi David,

Sorry for the late reply, I was out of town. Just wanted to say thanks!

Jason.

--
