Analyzer for generating keywords

Hi all,

I have a potentially stupid noob question about analyzers.

Is it possible to create an analyzer to populate a specified field in all
records automatically? Specifically, I want to create a field with keywords
based on a text field in that same record. I want to be able to do a check
on which keywords were generated for each record, to verify what the
analyzer is doing. Or do analyzers only work on indexing, and the results
can't be inspected directly?

Also, when an analyzer is applied to an index, does this also apply to
existing records, or do these records need to be reindexed? If so, how is
the reindexing triggered?

Thanks!

--

Hey Anton,

On Thursday, November 8, 2012 9:53:54 AM UTC+1, Anton wrote:

> Hi all,
>
> I have a potentially stupid noob question about analyzers.

In general, an analyzer is configured per field and is used at both index
time and query time. When you index a document, Elasticsearch picks the
analyzer for each field and passes the field's text through it. The
resulting tokens are then used to build the inverted index
(http://en.wikipedia.org/wiki/Inverted_index). At query time the same thing
happens: the query text is passed through the analyzer and the query is
built from the resulting tokens.
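
To make the index-time / query-time symmetry concrete, here is a minimal
Python sketch. It is a toy, not how Elasticsearch is implemented: the
analyzer, the stopword list, and the function names are all made up for
illustration, and requiring every query token to match is a simplification
chosen for this sketch.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, purely for illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_inverted_index(docs):
    """Map each token to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Run the query through the SAME analyzer, then intersect postings."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {
    1: "The Quick Brown Fox",
    2: "Quick analysis of the index",
    3: "A lazy dog",
}
index = build_inverted_index(docs)
print(sorted(search(index, "QUICK fox")))  # prints [1]
```

The query "QUICK fox" only matches because it goes through the same
lowercasing step as the indexed text. That is why the index and query
analyzers normally have to agree.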

> Is it possible to create an analyzer to populate a specified field in all
> records automatically? Specifically, I want to create a field with keywords
> based on a text field in that same record. I want to be able to do a check
> on which keywords were generated for each record, to verify what the
> analyzer is doing. Or do analyzers only work on indexing, and the results
> can't be inspected directly?

I am not sure I understand this question. Each field in ES can have its
own index and query analyzers, and you configure which field uses which
analyzer via the mapping API
(http://www.elasticsearch.org/guide/reference/mapping/). Custom analyzers
can also be defined via the REST API when you create the index
(http://www.elasticsearch.org/guide/reference/index-modules/analysis/).
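
As a rough sketch of what such a configuration could look like, the Python
below only builds the JSON body for creating an index. It sends nothing.
The field name "description" and the analyzer name "keyword_extractor" are
hypothetical. The layout follows recent Elasticsearch versions; older
releases nest the mapping under a type name and use "string" instead of
"text", so check the documentation for the version you run.

```python
import json


def build_index_body(text_field, analyzer_name):
    """Build an index-creation body with one custom analyzer on one field."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Custom analyzer assembled from built-in parts:
                    # standard tokenizer, then lowercase and stop filters.
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "stop"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                # Tell this one field to use the analyzer defined above.
                text_field: {"type": "text", "analyzer": analyzer_name}
            }
        },
    }


# Both names below are placeholders chosen for this example.
body = build_index_body("description", "keyword_extractor")
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the request body when the index is
created, so the analyzer is in place before any document is indexed.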

> Also, when an analyzer is applied to an index, does this also apply to
> existing records, or do these records need to be reindexed? If so, how is
> the reindexing triggered?

You can't change your analyzers once they have been configured, because
we would then need to reindex everything, and that is sometimes not
possible, e.g. if the data has to come from a third-party resource. You
should configure your analysis ahead of time. If you want more insight
into what happens when you pass a certain string to ES, check out the
analyze API
(http://www.elasticsearch.org/guide/reference/api/admin-indices-analyze.html),
which applies an analyzer to a piece of text and returns the resulting
tokens as a JSON response.
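
A minimal Python sketch of calling the analyze API could look like this.
It assumes a recent Elasticsearch version, where the analyzer name and the
text go in a JSON request body; older releases take the analyzer as a URL
parameter and the raw text as the body. The base URL and the helper names
are assumptions for illustration, and the sample response is hand-written
to show the shape of the reply. It is not captured output.

```python
import json
import urllib.request


def build_analyze_request(base_url, text, analyzer="standard"):
    """Prepare (but do not send) a POST request to the _analyze endpoint."""
    payload = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response_body):
    """Pull just the token strings out of a parsed _analyze response."""
    return [entry["token"] for entry in response_body.get("tokens", [])]


def analyze_text(base_url, text, analyzer="standard"):
    """Send the request to a running node and return the token strings."""
    request = build_analyze_request(base_url, text, analyzer)
    with urllib.request.urlopen(request) as response:
        return extract_tokens(json.load(response))


# Hand-written example of the response shape (not real output).
sample_response = {
    "tokens": [
        {"token": "quick", "start_offset": 0, "end_offset": 5,
         "type": "<ALPHANUM>", "position": 0},
        {"token": "fox", "start_offset": 6, "end_offset": 9,
         "type": "<ALPHANUM>", "position": 1},
    ]
}
print(extract_tokens(sample_response))  # prints ['quick', 'fox']
```

With a node running locally, something like
analyze_text("http://localhost:9200", "Quick Fox") would return the tokens
the chosen analyzer produces, which is the closest thing to inspecting the
"generated keywords" for a given piece of text.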

simon

> Thanks!

--