My team is migrating an application from a custom Lucene-based solution
to Elasticsearch, and we have some questions about the keyword analyzer.
In our "legacy" system we use org.apache.lucene.analysis.KeywordAnalyzer
on a field for both searching and indexing.
According to the documentation
(http://www.elasticsearch.org/guide/reference/index-modules/analysis/keyword-analyzer.html):
An analyzer of type keyword that “tokenizes” an entire stream as a single
token. This is useful for data like zip codes, ids and so on. Note, when
using mapping definitions, it makes more sense to simply mark the field as
not_analyzed.
What advantage does marking the field as not_analyzed have over analyzing
it with the keyword analyzer? Are the two functionally equivalent?
--
Hello!
They are not totally equivalent. With the keyword analyzer you can add additional filters, such as the lowercase filter, and still have your text indexed as a single term, although lightly analyzed. When you set the field to not_analyzed, the value is indexed exactly as it is.
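
To make the difference concrete, here is a small pure-Python model of the two behaviours described above. It does not call Lucene or Elasticsearch; the function names are made up for illustration, and the sample value is arbitrary. It only shows which single term would end up in the index in each case.

```python
def not_analyzed(value):
    # Model of a not_analyzed field: the whole value is indexed
    # verbatim as one term, with no analysis at all.
    return [value]


def keyword_tokenizer(value):
    # Model of keyword tokenization: the entire input becomes one token.
    return [value]


def lowercase_filter(tokens):
    # Model of a lowercase token filter applied after tokenization.
    return [token.lower() for token in tokens]


def keyword_plus_lowercase(value):
    # Single token, then lightly analyzed by the lowercase filter.
    return lowercase_filter(keyword_tokenizer(value))


sample = "AB-1234 North"
print(not_analyzed(sample))            # one term, case preserved
print(keyword_plus_lowercase(sample))  # one term, lowercased
```

In both cases the value stays a single term; only the filtered variant changes its content, which matters if you want case-insensitive exact matches.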
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
--
Hello!
Sorry, you are talking about the keyword analyzer, not the keyword tokenizer (http://www.elasticsearch.org/guide/reference/index-modules/analysis/keyword-tokenizer.html), which is what my previous answer described. In that case Clinton is right.
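
For reference, the three options discussed in this thread could be laid out roughly as follows. This is only a sketch of an index-creation body, built as a Python dict and printed as JSON. The type name "doc", the field names, and the analyzer name "lowercase_keyword" are hypothetical, and the mapping uses the string / not_analyzed syntax from the Elasticsearch versions this thread refers to; newer versions express exact-value fields differently, so check the documentation for your version.

```python
import json

index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer: keyword tokenizer plus a lowercase filter.
                # The name "lowercase_keyword" is made up for this example.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "doc": {  # hypothetical type name
            "properties": {
                # Indexed verbatim, no analysis.
                "zip_exact": {"type": "string", "index": "not_analyzed"},
                # Built-in keyword analyzer: whole value as a single token.
                "zip_keyword": {"type": "string", "analyzer": "keyword"},
                # Single token, lowercased by the custom analyzer above.
                "code_folded": {"type": "string", "analyzer": "lowercase_keyword"},
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

The first two fields should index the same single term; only the third, which uses the keyword tokenizer inside a custom analyzer, leaves room for extra filters.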
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch