Request information on Suggester Plugin


(Benoît) #1

Hello,

Despite what I said here:
https://groups.google.com/d/msg/elasticsearch/GBvbTx3t2Jk/QQzFP4T74icJ
I'd like some more information about the suggester plugin.

Is it possible to fully deactivate automatic refresh? If I want to refresh
once a day, I would prefer to choose the time.

_suggestRefresh accepts a field option. Would it be possible to configure
the plugin so that suggestions are built for only one field? I suppose it
could save resources (disk and CPU).

About the multi-index / multi-type format in URLs: the plugin seems to
accept the _all and * notation for the index, but the type is required. Is
there any reason for this?

Finally, I'd like to understand what happens when the plugin is installed:
does it build a special file for suggestions covering all indices and all
fields?

Thanks.

Benoît

--


(Alexander Reelsen) #2

Hi

On Fri, Oct 12, 2012 at 11:22 AM, Benoît benoit.intrw@gmail.com wrote:

> Is it possible to fully deactivate automatic refresh? If I want to refresh
> once a day, I would prefer to choose the time.

Not right now, but that's easy to implement. If you need it, create a
GitHub issue and I'll build it.

> _suggestRefresh accepts a field option. Would it be possible to configure
> the plugin so that suggestions are built for only one field? I suppose it
> could save resources (disk and CPU).

As long as you do not issue a suggest request for a field, the plugin
allocates no resources for it - allocation starts with the first
request. Please read below for more details about the implementation.

> About the multi-index / multi-type format in URLs: the plugin seems to
> accept the _all and * notation for the index, but the type is required. Is
> there any reason for this?

Actually, I am not making use of the type at the moment. I just added
it so that the URL looks similar to most other search requests.

> Finally, I'd like to understand what happens when the plugin is installed:
> does it build a special file for suggestions covering all indices and all
> fields?

It does not build a special file (and does not store anything on
disk). The Lucene FST suggester uses an in-memory structure to query
for suggestions. As soon as you request suggestions for any field,
this structure is created and then updated periodically (also in
memory, of course). The in-memory structure is per-field, not
per-index: if you only request suggestions for one field, there is
only one in-memory structure (per shard, of course).
Because updating this structure on every indexing operation would be
very time-consuming, it is updated periodically instead.
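
To illustrate the behaviour described above, here is a minimal Python
sketch. It is not the plugin's code: the real plugin is written in Java and
uses Lucene's FST suggester, whereas this sketch substitutes a sorted list
with binary search. The class name LazyFieldSuggester and the load_terms
callback are hypothetical names invented for the example. It only shows the
two ideas from the explanation: a per-field structure is allocated on the
first suggest request, and it stays stale until a refresh rebuilds it.

```python
import bisect
import threading


class LazyFieldSuggester:
    """One in-memory prefix structure per field, built on first request."""

    def __init__(self, load_terms):
        # load_terms(field) -> iterable of strings. Hypothetical stand-in
        # for reading a field's terms from the shard's index.
        self._load_terms = load_terms
        self._structures = {}  # field name -> sorted list of terms
        self._lock = threading.Lock()

    def _build(self, field):
        return sorted(set(self._load_terms(field)))

    def suggest(self, field, prefix, size=10):
        with self._lock:
            if field not in self._structures:
                # First request for this field: allocate its structure now.
                self._structures[field] = self._build(field)
            terms = self._structures[field]
        # Binary search for the first term >= prefix, then scan forward.
        start = bisect.bisect_left(terms, prefix)
        matches = []
        for i in range(start, len(terms)):
            if len(matches) >= size or not terms[i].startswith(prefix):
                break
            matches.append(terms[i])
        return matches

    def refresh(self, field=None):
        """Rebuild one field, or every field that has been requested so far."""
        with self._lock:
            fields = [field] if field is not None else list(self._structures)
            for name in fields:
                if name in self._structures:
                    self._structures[name] = self._build(name)

    def loaded_fields(self):
        with self._lock:
            return sorted(self._structures)


# Made-up data standing in for two indexed fields.
index = {"title": ["kitchen", "kite", "kitten"], "author": ["king"]}
suggester = LazyFieldSuggester(lambda field: index[field])

first = suggester.suggest("title", "kit")  # builds the "title" structure
loaded = suggester.loaded_fields()         # "author" was never requested

index["title"].append("kit")               # new data gets indexed
stale = suggester.suggest("title", "kit")  # structure not refreshed yet
suggester.refresh("title")                 # what a periodic job would call
fresh = suggester.suggest("title", "kit")
```

In this sketch refresh() is called by hand; a timer or scheduler calling it
at a fixed interval would give the periodic update, and calling it from a
daily job instead would correspond to the manual refresh asked about above.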

Regarding the "not yet ready for production" sign: I have been running
the plugin in production for quite some time, and I rewrote almost all
of it some time ago because of a file descriptor leak. I will test it
in the next few days, but I strongly expect that I removed the leak by
rewriting big portions of the plugin (staying as close as possible to
the Elasticsearch architecture instead of writing my own stuff and
hoping it works - which it did not).

Hope this helps. In case you have any further questions, or I forgot to
answer something, feel free to ask.

Oh, and by the way - if the AnalyzingSuggester makes it into Lucene
4.1, this plugin might not be needed at all anymore. See
http://java.dzone.com/articles/lucenes-new-analyzing

Regards, Alexander

--


(Benoît) #3

Thank you very much for your answer.

I will create GitHub issues for two of these points.

The implementation details you gave are really interesting.

Regards.

Benoît


--

