Field name length and ElasticSearch performance


(andym) #1

Hi,

Will the size of an ElasticSearch index decrease (and performance
increase due to a reduced memory footprint) if I shorten field
names?

I have many fields in my index with descriptive names (e.g.
“document_type”), and from reading docs about other NoSQL databases it
appears that they store field names verbatim. As a result, the size
of the database varies greatly with the length of the field names
(especially when the number/length of field names exceeds that of their values).

A common suggestion is to shorten the field names (e.g. turn
“document_type” into “dt”) to address the problem. Would the same
suggestion apply to ElasticSearch?

Thanks,

-- Andy


(Clinton Gormley) #2

On Wed, 2012-02-08 at 10:36 -0800, andym wrote:

Hi,

Will the size of an ElasticSearch index decrease (and performance
increase due to a reduced memory footprint) if I shorten field
names?

I have many fields in my index with descriptive names (e.g.
“document_type”), and from reading docs about other NoSQL databases it
appears that they store field names verbatim. As a result, the size
of the database varies greatly with the length of the field names
(especially when the number/length of field names exceeds that of their values).

A common suggestion is to shorten the field names (e.g. turn
“document_type” into “dt”) to address the problem. Would the same
suggestion apply to ElasticSearch?

Field names are used in two places:

  1. in the Lucene index
  2. in the stored _source

In the Lucene index, field names are stored once, and the _source can be
compressed. Normally, the amount of storage used for your field names is
dwarfed by the amount of storage used for your data.
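
If you want a rough feel for how much the field names in _source cost
once compression is involved, a small sketch like the one below can help.
It is only an illustration, not how ElasticSearch works internally: it
uses Python's zlib as a stand-in compressor, compresses a whole batch of
documents as one block (an assumption; the real gain depends on how your
ElasticSearch version compresses stored fields), and the documents and
short names ("dt", "cn") are made up.

```python
import json
import zlib


def source_sizes(docs):
    """Return (raw_bytes, compressed_bytes) for a batch of JSON documents."""
    # One compact JSON document per line, similar in spirit to stored _source.
    raw = "\n".join(json.dumps(d, separators=(",", ":")) for d in docs).encode("utf-8")
    # zlib is only a stand-in compressor here (assumption, not what ES uses).
    return len(raw), len(zlib.compress(raw))


# Hypothetical documents: identical values, long vs. short field names.
long_docs = [
    {"document_type": "invoice", "customer_name": f"customer {i}"}
    for i in range(1000)
]
short_docs = [
    {"dt": d["document_type"], "cn": d["customer_name"]} for d in long_docs
]

long_raw, long_zip = source_sizes(long_docs)
short_raw, short_zip = source_sizes(short_docs)

print("raw bytes:        long =", long_raw, " short =", short_raw)
print("compressed bytes: long =", long_zip, " short =", short_zip)
print("raw savings:       ", long_raw - short_raw)
print("compressed savings:", long_zip - short_zip)
```

Comparing the two "savings" lines shows how much of the benefit of short
names is left after compression. Measuring your own data this way (or
simply comparing index sizes on disk) is more reliable than guessing.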

Frankly, I think this is a premature optimization, and it makes much
more sense to keep things readable and maintainable.

clint

