I am using the langdetect plugin to dynamically assign the analyzer at index
time, with the following mapping:
PUT test
POST test/article/_mapping
{
  "article" : {
    "_analyzer" : {
      "path" : "description.lang"
    },
    "properties" : {
      "description" : { "type" : "langdetect" }
    }
  }
}
The langdetect plugin reports the detected language as 'en', 'fr', 'de', and
so on, so the analyzers have to be named 'en', 'fr', etc. This makes the names
less descriptive, and the context of each analyzer is lost. Is it possible to
derive a more descriptive name, so that _analyzer resolves to
'en_icu_analyzer' instead of just 'en'?
Something like... (this does not work; it is just what I want to
achieve).
This requires a change in the langdetect plugin. Feel free to open an issue
at
It should be trivial to wrap something like a String.format() around the
language name, controlled by a configuration option.
Jörg
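To make the String.format() suggestion concrete, here is a minimal sketch of what such a wrapper could look like. It is not code from the langdetect plugin: the class name AnalyzerNameFormatter and the idea of passing in a pattern such as "%s_icu_analyzer" are assumptions for illustration, and the plugin would still need a new (currently nonexistent) setting to supply that pattern.

```java
import java.util.Locale;

// Hypothetical helper, NOT part of the langdetect plugin.
// It turns a detected language code ("en") into an analyzer name
// ("en_icu_analyzer") using a configurable format pattern.
public class AnalyzerNameFormatter {

    // Pattern with a single %s placeholder for the language code.
    // Assumption: this value would come from a plugin/mapping setting.
    private final String pattern;

    public AnalyzerNameFormatter(String pattern) {
        // Fall back to the bare language code when no pattern is configured,
        // which matches the behaviour described in the question.
        this.pattern = (pattern == null || pattern.isEmpty()) ? "%s" : pattern;
    }

    public String format(String languageCode) {
        return String.format(Locale.ROOT, pattern, languageCode);
    }

    public static void main(String[] args) {
        AnalyzerNameFormatter formatter = new AnalyzerNameFormatter("%s_icu_analyzer");
        for (String lang : new String[] {"en", "fr", "de"}) {
            System.out.println(lang + " -> " + formatter.format(lang));
        }
    }
}
```

With a pattern like this, the analyzers in the index settings could be registered under names such as en_icu_analyzer and fr_icu_analyzer, while the detector keeps emitting plain language codes.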
On Thu, Aug 28, 2014 at 8:42 AM, Nitin Maheshwari ask4nitin@gmail.com
wrote:
Is it possible to derive a more descriptive name, such that _analyzer
resolves to 'en_icu_analyzer', instead of just 'en'?