Help understanding keyword vs not_analyzed

My team is in the process of migrating an application from a custom
Lucene-based solution to Elasticsearch, and we have some questions about the
keyword analyzer. In our "legacy" system we analyze a field with
org.apache.lucene.analysis.KeywordAnalyzer at both index time and search
time.

According to the documentation: (http://www.elasticsearch.org/guide/reference/index-modules/analysis/keyword-analyzer.html)

An analyzer of type keyword that “tokenizes” an entire stream as a single
token. This is useful for data like zip codes, ids and so on. Note, when
using mapping definitions, it make more sense to simply mark the field as
not_analyzed.

What advantage does marking a field as not_analyzed have over analyzing it
with the keyword analyzer? Are the two functionally equivalent?

--

Yes, they are equivalent.

clint

--

Hello!

They are not totally equivalent. With the keyword analyzer you can add additional filters, for example a lowercase filter, and still have your text indexed as a single term, although lightly analyzed. When you set the field to not_analyzed, the value is indexed exactly as it is.
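
As a rough illustration of the difference described here, the following is a minimal plain-Python sketch. It does not call Lucene or Elasticsearch; the function names are hypothetical, and it assumes that "keyword plus filters" means a single whole-value token passed through a chain of token filters.

```python
# Plain-Python imitation of the two indexing behaviours discussed in this thread.
# Nothing here is a Lucene or Elasticsearch API; all names are hypothetical.

def not_analyzed_terms(value):
    """Index the field value untouched, as exactly one term."""
    return [value]


def keyword_tokenizer_terms(value, filters=()):
    """Emit the whole value as one token, then run each filter over it."""
    token = value
    for token_filter in filters:
        token = token_filter(token)
    return [token]


value = "SW1A 2AA"
print(not_analyzed_terms(value))                            # ['SW1A 2AA']
print(keyword_tokenizer_terms(value))                       # ['SW1A 2AA']
print(keyword_tokenizer_terms(value, filters=[str.lower]))  # ['sw1a 2aa']
```

With no filters the two functions produce the same single term, which is the sense in which the two options are equivalent. Only when a filter is added does the indexed term differ from the original value.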

--

Regards,

Rafał Kuć

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

--

Hello!

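Sorry, you are talking about the keyword analyzer, not the keyword tokenizer (http://www.elasticsearch.org/guide/reference/index-modules/analysis/keyword-tokenizer.html). In that case Clinton is right. What I wrote about adding filters applies to a custom analyzer built on the keyword tokenizer.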
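
For reference, here is a hedged sketch of how the two options might look in an index definition, written as a Python dictionary that serializes to the JSON request body. The analyzer name `lowercase_keyword`, the type name `item` and the field names are made up for illustration. The `string` / `not_analyzed` mapping syntax is the one in use at the time of this thread; later Elasticsearch releases changed it, so check the documentation for your version.

```python
import json

# Hypothetical analyzer, type and field names, for illustration only.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer: whole value as one token, then lowercased.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "item": {
            "properties": {
                # Indexed exactly as supplied, with no analysis at all.
                "zip_code": {"type": "string", "index": "not_analyzed"},
                # Indexed as a single lowercased term.
                "product_id": {"type": "string", "analyzer": "lowercase_keyword"},
            }
        }
    },
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```

If no filters are needed, the `zip_code` style is the simpler choice, which matches the note in the documentation quoted in the question.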

--

Regards,

Rafał Kuć

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

--