Default analyzer when the given analyzer not found?

Han_2 · March 25, 2013, 5:48pm

I have an index with analyzer settings as shown below:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": { "type": "english" },
        "ar": { "type": "arabic" },
        "hy": { "type": "armenian" },
        ...
      }
    }
  }
}

and a type mapping as shown below:

{
  "type1" : {
    "_analyzer" : {
      "path" : "language"
    },
    "properties" : {
      "id" : { "type" : "string", "index" : "not_analyzed" },
      "name" : { "type" : "string" },
      "language" : { "type" : "string", "index" : "not_analyzed" }
    }
  }
}

The language can be anything, and it might not have a matching analyzer
defined. So my question is: is there any way to specify settings such that
Elasticsearch uses the 'default' analyzer when no matching analyzer is
found? Currently I am getting a "No analyzer found" error message.

I could list all of the languages and define an analyzer for each of them,
but in our case the language list keeps changing. It would be nice to have
a default analyzer apply when there is NO matching analyzer.

I would really appreciate any suggestions.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Igor_Motov · March 26, 2013, 2:57pm

That sounds like a useful feature. I would suggest creating an issue for
it.

Frederic_Meyer · February 25, 2014, 1:57pm

Hey there.

Nearly one year after this initial post, I'm running into the exact same
issue, even though ES 1.0 has now been released.

Has anybody found a proper solution within ES? I've spent about an hour
searching for this, without any luck.

The only ugly workaround that I can think of right now is to handle a
fallback language at the data level, i.e. before sending documents to ES
to be indexed.
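
A minimal sketch of that data-level fallback in Python, not a confirmed fix. It assumes the client knows which analyzer names exist in the index settings ("ar" and "hy" are from the settings earlier in the thread; the rest of that list is elided there). It writes the resolved name to a hypothetical analyzer_name field, so the _analyzer path in the mapping would have to point at that field instead of language. It also assumes that a field value of "default" resolves to the analyzer named default, which nothing in this thread confirms.

```python
# Analyzer names defined in the index settings. "ar" and "hy" come from the
# settings shown earlier in the thread; extend this set to match your index.
KNOWN_ANALYZERS = {"ar", "hy"}

# Assumption: the value "default" resolves to the analyzer named "default".
FALLBACK_ANALYZER = "default"


def resolve_analyzer(language, known=KNOWN_ANALYZERS, fallback=FALLBACK_ANALYZER):
    """Return the language itself if an analyzer with that name exists, else the fallback."""
    if isinstance(language, str) and language in known:
        return language
    return fallback


def prepare_document(doc):
    """Return a copy of doc with a hypothetical analyzer_name field added.

    The original language value is left untouched so it can still be
    stored and filtered on.
    """
    prepared = dict(doc)
    prepared["analyzer_name"] = resolve_analyzer(doc.get("language"))
    return prepared
```

Each document would go through prepare_document() before being sent to ES. The drawback is the one already mentioned: the list of known analyzers has to be kept in sync with the index settings by hand.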

Thanks.

brian_yoder · February 25, 2014, 4:09pm

Based on posts to this newsgroup early on in my usage of ES (over a year
now!), I used to put the following in my elasticsearch.yml file. Any field
that was not explicitly assigned an analyzer and that ES deemed to be a
string would pick up the English snowball analyzer with no stop words (my
preference at the time):

index:
  analysis:
    analyzer:
      # set stemming analyzer with no stop words as the default
      default:
        type: snowball
        language: English
        stopwords: _none_
    filter:
      stopWordsFilter:
        type: stop
        stopwords: _none_

But I've long since abandoned this default approach. Instead, I
explicitly assign an analyzer to each and every field (you know, like a
real database!). My elasticsearch.yml file now contains the following:

# Do not automatically create an index when a document is loaded, and do
# not automatically index unknown (unmapped) fields:
action.auto_create_index: false
index.mapper.dynamic: false

Therefore, I cannot automatically create an index during a load (which
would then create a useless index without any of the analyzers and mappings
I've carefully crafted). And I cannot get ES to automatically create a new
field; this is very helpful when someone uses a low-level tool such as
curl, and misspells a field name; ES will no longer create, for example,
the givveName field when it should have been givenName.

Brian

Frederic_Meyer · February 25, 2014, 4:19pm

Ah yes, via the default in the YAML configuration file, of course. I'll
give that a try, thanks!

It is a pity, though, that the "default" analyzer doesn't seem to do its job
of processing all unmatched documents as far as the _analyzer field is
concerned.

Thanks
Fred

P.S.: I do understand your position about not indexing documents for which
you haven't crafted a dedicated analyzer yet. Makes real sense.
