Type or field


(arta) #1

Hi,
I have only one document type (only one mapping).
The documents are grouped, for example, by zip code.
And our searches are always bound within a group, for example only for 94040.
We do not search on multiple groups.
In this situation, which design is better?

  1. have the zip code as a type, so that we can search only for that zip code by issuing search query
    "http://localhost:9200//94040/_search" ...
  2. have the zip code as a field, then filter by the field
    "http://localhost:9200///_search" -d '{
    "query":{"filtered":{"query":{..},"filter":{"term":{"zip":94040}}}}}'

Thank you for your help.
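
The two request shapes described in the question can be sketched as follows. This is only an illustration that builds the URL and JSON body without contacting a cluster. The index name is elided in the post, so `my_index` and the type name `doc` are hypothetical placeholders, and `match_all` stands in for the elided inner query. The `filtered` query is kept as written in the post; it is the syntax of Elasticsearch releases from that period.

```python
import json

BASE = "http://localhost:9200"
INDEX = "my_index"  # hypothetical: the index name is elided in the post
DOC_TYPE = "doc"    # hypothetical: the single type used by option 2


def type_per_zip_request(zip_code, query):
    """Option 1: the zip code is the type, so it appears in the URL path."""
    url = f"{BASE}/{INDEX}/{zip_code}/_search"
    body = {"query": query}
    return url, json.dumps(body)


def zip_field_request(zip_code, query):
    """Option 2: one type; the zip code is a field matched by a term filter."""
    url = f"{BASE}/{INDEX}/{DOC_TYPE}/_search"
    body = {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"term": {"zip": zip_code}},
            }
        }
    }
    return url, json.dumps(body)


# Placeholder for the inner query that the post elides as {..}
inner_query = {"match_all": {}}

for build in (type_per_zip_request, zip_field_request):
    url, body = build(94040, inner_query)
    print(url)
    print(body)
```

Either way the caller passes the same two inputs (a zip code and an inner query); only where the zip code ends up differs.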


(Ivan Brusic) #2

I do not think there is a difference in performance. Mappings are
defined by type, so it would be easier to maintain a mapping against a
fixed type and not need a dynamic mapping across all types in an
index, IMHO.

On Tue, Jun 12, 2012 at 10:16 AM, arta artasano@sbcglobal.net wrote:

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/type-or-field-tp4019143.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


(arta) #3

Thank you Ivan,
Good to know there isn't much difference.

Here are my concerns:

  1. have the zip code as a type, so that we can search only for that zip code by specifying 'type'
    Pros:
    • easy to construct the query, as no explicit filter clause is necessary
    • no need for a 'zip_code' field, so presumably saving disk space?
    • type handling is a built-in feature of ES, so presumably faster processing than a query filter?
    Cons:
    • I guess the default mapping is copied for each type when a type is added, so more memory consumption?
  2. have the zip code as a field, then apply a filter by the field
    Pros and cons opposite to 1)

In general, I'd like to know the difference in terms of disk usage, memory consumption and search speed.
Thanks, again, for your help.


(Shay Banon) #4

There isn't a difference in terms of performance or disk usage: a type is
actually stored as just another field in each doc. With a type per zip code
you would create a separate mapping for each zip, which is a waste in terms
of storing the cluster state. So, use a field and filter by it.
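
The cluster-state point can be illustrated with a toy model. This is a rough sketch, not Elasticsearch internals: it only compares how many mapping entries each design would need to describe. The `name` field, the `doc` type name, and the `integer` type chosen for `zip` are assumptions made for the example.

```python
import json

# Hypothetical document fields, used only for illustration.
BASE_PROPERTIES = {"name": {"type": "string"}}


def mappings_type_per_zip(zip_codes):
    """Type-per-zip design: one mapping entry for every zip code."""
    return {str(z): {"properties": dict(BASE_PROPERTIES)} for z in zip_codes}


def mappings_zip_field(type_name="doc"):
    """Field design: a single mapping that also declares the zip field."""
    props = dict(BASE_PROPERTIES)
    props["zip"] = {"type": "integer"}
    return {type_name: {"properties": props}}


zips = [94040, 94041, 94043]
per_zip = mappings_type_per_zip(zips)
single = mappings_zip_field()

# Number of mapping entries and their serialized size for each design
print(len(per_zip), len(json.dumps(per_zip)))
print(len(single), len(json.dumps(single)))
```

In this model the type-per-zip design grows by one mapping entry per zip code, while the field design stays at one entry no matter how many zip codes are indexed.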

On Wed, Jun 13, 2012 at 6:50 PM, arta artasano@sbcglobal.net wrote:


(arta) #5

Thank you, kimchy.

