Adding NGram to language analyzer


(Shamun) #1

Hi,
I use the built-in Arabic analyzer to index my Arabic text.
I want to add an autocomplete feature to my search, so I thought about adding
an NGram filter.
Is it possible to extend an existing analyzer?
If not, what is the configuration of the Arabic analyzer?

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

You can create and configure your own custom analyzers and use them inside
of your mapping, see
http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer/

Note that you cannot change the analyzer configuration of a field that
already has data indexed. So you either need to recreate your index or use the
so-called multi-field functionality. See
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type/
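
A minimal sketch of the multi-field idea described above, written as a Python dict that serializes to an index-creation body. It is an illustration under assumptions, not configuration taken from this thread: the field name `title` and the names `autocomplete_analyzer` and `autocomplete_filter` are hypothetical, and the `fields` / typeless `mappings` syntax is the one used by later Elasticsearch releases (at the time of this thread the linked `multi_field` type played that role).

```python
import json

# Hypothetical names: "title", "autocomplete_analyzer", "autocomplete_filter".
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Edge n-grams emit prefixes of each token (2..15 chars here).
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 15,
                }
            },
            "analyzer": {
                "autocomplete_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "title": {
                # Main field keeps the built-in language analyzer.
                "type": "text",
                "analyzer": "arabic",
                "fields": {
                    # Sub-field indexes the same text with the custom analyzer.
                    "autocomplete": {
                        "type": "text",
                        "analyzer": "autocomplete_analyzer",
                        "search_analyzer": "standard",
                    }
                },
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

With a layout like this, regular searches would target `title` and autocomplete queries would target `title.autocomplete`, so the existing field does not need a different analyzer.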

Hope this helps.

--Alex

On Mon, Jul 29, 2013 at 8:42 AM, Ido Shamun idoesh1@gmail.com wrote:

Hi,
I use the built-in Arabic analyzer to index my Arabic text.
I want to add an autocomplete feature to my search, so I thought about adding
an NGram filter.
Is it possible to extend an existing analyzer?
If not, what is the configuration of the Arabic analyzer?

Thanks!



(Shamun) #3

I know how to create a custom analyzer and use multi-field.
But I don't know what the Arabic analyzer configuration is. I found some
references, but they aren't exact and produce different indexing results
from the built-in one.
On Jul 30, 2013 9:58 AM, "Alexander Reelsen" alr@spinscale.de wrote:

Hey,

You can create and configure your own custom analyzers and use them inside
of your mapping, see
http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer/

Note that you cannot change the analyzer configuration of a field that
already has data indexed. So you either need to recreate your index or use the
so-called multi-field functionality. See
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type/

Hope this helps.

--Alex



(Lukáš Vlček) #4

Hi,

Maybe this can help?

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/analysis/ArabicAnalyzerProvider.java
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/analysis/common/src/java/org/apache/lucene/analysis/ar/ArabicAnalyzer.java

Both links refer to the master/trunk.
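
As a rough illustration of what a rebuilt Arabic analyzer with an added n-gram step might look like, here is a sketch in Python that builds the index settings. The filter chain follows the "rebuilt" Arabic example in the language analyzers documentation of later Elasticsearch releases, with the optional `keyword_marker` step left out; whether it matches the built-in analyzer token for token on a given version is an assumption to check (for instance, `decimal_digit` is not available on older versions). The names `arabic_search`, `arabic_autocomplete`, and `arabic_edge_ngram` are hypothetical.

```python
import json


def arabic_autocomplete_settings(min_gram=2, max_gram=15):
    """Build settings with a rebuilt Arabic analyzer and an edge n-gram variant."""
    # Base chain mirroring the documented rebuilt Arabic analyzer
    # (keyword_marker omitted; verify against your Elasticsearch version).
    base_chain = [
        "lowercase",
        "decimal_digit",
        "arabic_stop",
        "arabic_normalization",
        "arabic_stemmer",
    ]
    return {
        "analysis": {
            "filter": {
                "arabic_stop": {"type": "stop", "stopwords": "_arabic_"},
                "arabic_stemmer": {"type": "stemmer", "language": "arabic"},
                # Hypothetical name; emits prefixes of each stemmed token.
                "arabic_edge_ngram": {
                    "type": "edge_ngram",
                    "min_gram": min_gram,
                    "max_gram": max_gram,
                },
            },
            "analyzer": {
                # Search side: same chain without n-grams.
                "arabic_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": base_chain,
                },
                # Index side: same chain plus the edge n-gram filter.
                "arabic_autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": base_chain + ["arabic_edge_ngram"],
                },
            },
        }
    }


settings = arabic_autocomplete_settings()
print(json.dumps(settings, indent=2))
```

Keeping a separate search-side analyzer without the n-gram filter is a design choice: the query text is then matched against the indexed prefixes instead of being split into prefixes itself.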

Regards,
Lukas

On Tue, Jul 30, 2013 at 9:06 AM, Ido Shamun idoesh1@gmail.com wrote:

I know how to create a custom analyzer and use multi-field.
But I don't know what the Arabic analyzer configuration is. I found some
references, but they aren't exact and produce different indexing results
from the built-in one.


(Panagiotis Nikitopoulos) #5

I have the exact same problem with the Greek language.
Have you figured out how to solve it?
Thanks!



(Nik Everett) #6

On Thu, May 29, 2014 at 4:05 PM, Panagiotis Nikitopoulos <panosbobos@gmail.com> wrote:

I have the exact same problem with the Greek language.
Have you figured out how to solve it?
Thanks!

First build a copy of the Greek analyzer as a custom analyzer. Have a look
at
https://gerrit.wikimedia.org/r/#/c/130646/1/default_analyzer_config.json,cm
which contains an analyzer called custon_greek, which is exactly that. Copy
that analyzer and all the filters that it needs (also in that file)
into your configuration. Rename as you like. Verify that it works as you
expect. Start hacking on it to add your changes.
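
The copy-then-verify steps above could be sketched roughly as follows. This is not the content of the linked file: the Greek filter chain is taken from the "rebuilt" Greek example in the language analyzers documentation of later Elasticsearch releases (optional `keyword_marker` omitted), and `my_greek`, `greek_copy_settings`, and `same_tokens` are hypothetical names. The comparison helper assumes the usual `_analyze` response shape, a `tokens` list whose entries carry a `token` string.

```python
def greek_copy_settings():
    """Settings for a custom analyzer intended to mirror the built-in Greek one."""
    return {
        "analysis": {
            "filter": {
                "greek_lowercase": {"type": "lowercase", "language": "greek"},
                "greek_stop": {"type": "stop", "stopwords": "_greek_"},
                "greek_stemmer": {"type": "stemmer", "language": "greek"},
            },
            "analyzer": {
                # Hypothetical name; rename as you like.
                "my_greek": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["greek_lowercase", "greek_stop", "greek_stemmer"],
                }
            },
        }
    }


def token_texts(analyze_response):
    """Extract the token strings from an _analyze-style response dict."""
    return [entry["token"] for entry in analyze_response.get("tokens", [])]


def same_tokens(built_in_response, custom_response):
    """True when both analyzers produced the same token sequence."""
    return token_texts(built_in_response) == token_texts(custom_response)


# Made-up placeholder responses, only to show how the check is used.
built_in = {"tokens": [{"token": "tok1"}, {"token": "tok2"}]}
custom = {"tokens": [{"token": "tok1"}, {"token": "tok2"}]}
print(same_tokens(built_in, custom))
```

In practice the two responses would come from sending the same sample text to the `_analyze` API, once with the built-in `greek` analyzer and once with the copy. Once they agree, a filter such as an n-gram step can be appended to the copy's chain.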

Nik


