In my example, I have an object I am indexing which is similar to a
discussion thread. The 'content' property will contain text which may be in
English or Arabic.
If the JSON document I am indexing states which language it uses, can an
analyzer be chosen per document at index and search time?
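
To make the question concrete, here is roughly what I have in mind, as a
Python sketch. The 'lang' property, the 'content_en' / 'content_ar' field
names, and the helper itself are made up for illustration. I am assuming each
language-specific field could then be mapped to its own analyzer, which I
have not verified.

```python
# Hypothetical sketch: move 'content' into a per-language field before
# indexing, so each field could (presumably) get its own analyzer.
# Field names and the 'lang' property are assumptions, not a real API.
FIELD_BY_LANG = {"en": "content_en", "ar": "content_ar"}


def route_content(doc, default_lang="en"):
    """Return a copy of doc with 'content' moved to a language-specific field."""
    routed = dict(doc)  # shallow copy, the caller's document is left untouched
    lang = routed.get("lang", default_lang)
    # Unknown languages fall back to the default language's field.
    field = FIELD_BY_LANG.get(lang, FIELD_BY_LANG[default_lang])
    if "content" in routed:
        routed[field] = routed.pop("content")
    return routed
```

At search time the same lookup would pick which field to query, provided the
query's language is known.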
I don't know much about mappings yet, but the multi-type approach worries me
because the 'content' field will be knowingly indexed once with the correct
analyzer and once with the incorrect analyzer.
It appears from the doc entry that the query is then performed only against
the 'default' entry in the multi-type instead of applying against all
multi-type entries. This makes it a bit harder to manage queries I think. If
multi-type is the only way to be able to search for multilingual text in a
field, I suppose I will have to adapt.
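
For reference, this is my reading of the multi-type approach, sketched as
Python dicts. The 'multi_field' type and its 'fields' layout follow my
understanding of the mapping docs. The 'thread' type name, the 'ar' sub-field
name, and the analyzer names ('standard', 'arabic') are assumptions, and I
have not run any of this against a server.

```python
import json

# Assumed mapping: 'content' is indexed twice, once per analyzer.
# The entry named like the field itself is the 'default' one.
mapping = {
    "thread": {  # hypothetical type name
        "properties": {
            "content": {
                "type": "multi_field",
                "fields": {
                    "content": {"type": "string", "analyzer": "standard"},
                    "ar": {"type": "string", "analyzer": "arabic"},
                },
            }
        }
    }
}


def both_fields_query(text):
    """Build a query that names both entries explicitly, not just the default."""
    return {
        "query": {
            "query_string": {
                "fields": ["content", "content.ar"],
                "query": text,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
    print(json.dumps(both_fields_query("hello"), indent=2))
```

If this is right, every query has to list the extra entries by hand, which is
the management overhead I am worried about.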
A quick search shows there are some analyzers out there that have been
developed for this problem (e.g.
http://www.sematext.com/products/multilingual-indexer/index.html). In the
docs there is a list of built-in analyzers. Is it straightforward to include
and configure other analyzers? Any pointers to docs?
On Fri, Aug 27, 2010 at 7:34 AM, Clinton Gormley firstname.lastname@example.org:
On Fri, 2010-08-27 at 13:33 +0300, Shay Banon wrote:
No, you can't specify different analyzers on the same field.
But you can index the same field twice, as a multi field, with different
On Thu, Aug 26, 2010 at 4:31 PM, James Cook email@example.com
Circling around to my earlier question, can I have an English
and Arabic analyzer specified on the same fields across
On Wed, Aug 25, 2010 at 2:57 PM, Shay Banon
On Wed, Aug 25, 2010 at 9:13 PM, Andrei
On Aug 24, 4:45 pm, Shay Banon
> Yes, you can create your own analyzer and
add to it the asciifolding filter.
> The ICU plugin might also be interesting for
Do you mean to create one in Java or in the
It's in a configuration file. You create a custom
analyzer that includes it.
> It depends how far you want to take it.
There are specific analyzers for
> different languages. I updated the docs to
Could you link to the page that you updated? I
couldn't find the
references to non-English languages there.