Not analyzed vs Analyzed

(Dave) #1

I have data that I'm only going to query in a database-like way, with exact queries such as where car = 'mustang'. In some cases I might do where car like 'mustan%'.

I'm going to do lots of aggregations on the data. I believe I can get away with just term queries, so my thinking is that I can create all my fields as non-analyzed fields.

I'm just trying to figure out, from a performance standpoint, which would be better. I've set up one of my indexes with a field that is multi-type and I seem to be getting slightly better performance with non-analyzed. I don't know how that will scale going forward.

Just curious what the pros/cons are of non-analyzed vs analyzed fields, given that I really only need "exact" queries or wildcard queries. I do not need full text search or anything complex. No special analysis, case-insensitive matching, etc. Essentially just "SQL"-like queries: where a = b or where a like 'b%'
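For reference, the two SQL-style predicates above map roughly onto a term query and a prefix query, and the aggregates onto a terms aggregation. The following is a minimal sketch that only builds the request bodies as Python dicts; it does not talk to a cluster. The helper names (exact_match, starts_with, count_by) and the field name "car" are made up for illustration, and the sketch assumes the field is indexed as non-analyzed so the stored term is the exact original value.

```python
import json


def exact_match(field, value):
    """Roughly `WHERE field = value`. A term query does not analyze `value`."""
    return {"query": {"term": {field: value}}}


def starts_with(field, like_pattern):
    """Roughly `WHERE field LIKE 'abc%'`.

    Only a single trailing % is handled in this sketch; anything else
    would need a wildcard query instead of a prefix query.
    """
    if not like_pattern.endswith("%") or "%" in like_pattern[:-1]:
        raise ValueError("only patterns of the form 'abc%' are supported here")
    return {"query": {"prefix": {field: like_pattern[:-1]}}}


def count_by(field):
    """Roughly `SELECT field, COUNT(*) ... GROUP BY field` (hits suppressed)."""
    return {"size": 0, "aggs": {"by_" + field: {"terms": {"field": field}}}}


# Print the JSON bodies that would be sent to the _search endpoint.
print(json.dumps(exact_match("car", "mustang")))
print(json.dumps(starts_with("car", "mustan%")))
print(json.dumps(count_by("car")))
```

Against an analyzed field the same term query would be compared with the analyzer's output tokens (for example lowercased words), which is why exact-match use cases usually go with non-analyzed fields.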

Does one perform better, take more storage, or add more overhead?

Thank you

(Ivan Brusic) #2

With Elasticsearch 2.0+, non-analyzed fields use doc values by
default, which offers better performance and reduces your Java heap
usage. The downsides are that they use more disk space and slightly
increase indexing time. Using off-heap memory and reducing the size of the
heap is a great improvement.
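
As an illustration of the above, here is a minimal sketch of what such a mapping could look like, written as Python dicts that serialize to the JSON body for index creation. It assumes the Elasticsearch 2.x mapping syntax (string fields with "index": "not_analyzed"). The type name "vehicle", the field name "car", the sub-field name "raw", and the helper names are hypothetical. Setting "doc_values" explicitly is redundant on 2.x for non-analyzed fields and is shown only to make the default visible.

```python
import json


def not_analyzed_string(doc_values=True):
    """A 2.x-style string field that is indexed as a single exact term."""
    return {"type": "string", "index": "not_analyzed", "doc_values": doc_values}


def multi_field_string():
    """An analyzed string with a non-analyzed sub-field, addressed as `<field>.raw`."""
    return {"type": "string", "fields": {"raw": not_analyzed_string()}}


# Hypothetical index body: one exact-match-only field, one multi-field.
index_body = {
    "mappings": {
        "vehicle": {
            "properties": {
                "car": not_analyzed_string(),
                "description": multi_field_string(),
            }
        }
    }
}

print(json.dumps(index_body, indent=2))
```

With a mapping along these lines, term queries and aggregations on "car" (or on "description.raw") read from doc values on disk rather than from heap-resident fielddata.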

