Would it be a bad idea to define a universal type containing thousands,
perhaps tens of thousands, of fields within a single elasticsearch Mapping?
The documents I store for a given index would use a subset of the fields in
the mapping but for management reasons it would be convenient to have a
universal mapping type that I could use for all indexes.
I have never explored the impact on performance with thousands of
types (not even hundreds), but you can override the default behavior
of types or use dynamic templates to define a universal type.
It depends on whether you want to override the defaults based on type
(string, boolean, etc.) or on field name. I believe I have read at some
point that too many types do have an impact on performance.
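As a rough illustration of the dynamic-template approach, a template applied through the `_default_` mapping could look something like the sketch below. This assumes the pre-1.0 mapping syntax current at the time of this thread, and the template name `strings_not_analyzed` is just made up for the example:

```json
{
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings_not_analyzed": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
```

Any new string field matching the template would then be mapped consistently without having to list every field in advance.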
Each field comes with an overhead of memory usage (on the Lucene level).
Theoretically it's possible, but you will need to check the memory usage
(sadly, the memory used by Lucene is not exposed, so it can't be reported
by ES).
On Wednesday, May 9, 2012 1:55:32 AM UTC-7, kimchy wrote:
Each field comes with an overhead of memory usage (on the Lucene level).
Theoretically it's possible, but you will need to check the memory usage
(sadly, the memory used by Lucene is not exposed, so it can't be reported
by ES).
How many fields is excessive? Currently I'm storing syslog and apache
logs, but I'm considering letting the development teams log anything in
JSON format, which would let them create any fields they please.
Is 100 fields too many? Or 1000? Any guidance would be appreciated.
Nice to meet you again! Although you already got some of my opinion on
the subject on the Logstash ML, here's a second shot:
On Mon, Feb 11, 2013 at 7:22 PM, Bruce Lysik blysik@yahoo.com wrote:
How many fields is excessive? Currently I'm storing syslog and apache
logs, but I'm considering letting the development teams log anything in
JSON format, which would let them create any fields they please.
In production you probably want to have control over those fields, instead
of relying on ES to detect field types for you. One reason is that if
the first log in an index accidentally contains an integer in a field that
would normally be a string, you'll get indexing errors for pretty much all
the other logs that day.
How many fields do you expect to get in total?
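To illustrate the point about pinning down field types up front, an explicit mapping for a syslog type might look like the sketch below (the field names here are hypothetical, and the syntax is the string/not_analyzed style of the ES versions current in this thread):

```json
{
  "syslog": {
    "properties": {
      "timestamp": { "type": "date" },
      "host":      { "type": "string", "index": "not_analyzed" },
      "severity":  { "type": "integer" },
      "message":   { "type": "string" }
    }
  }
}
```

With this in place, a log that sends an unexpected type into one of these fields fails loudly on its own, instead of silently locking in a wrong auto-detected type for the rest of the index.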
Is 100 fields too many? Or 1000? Any guidance would be appreciated.
Maybe someone else can pop in with some benchmarks, but 100 or even 1000
fields doesn't sound like too many to me.
Of course, the definition of excessive will depend on quite a few factors,
like:
- how much memory you have
- how many mapping types you have per index (you'll have to add them up to
  get the total number of fields per Lucene index)
- what search performance you're expecting (which in turn depends on how
  much data you have in an index, how many indices, shards...)
So if you want to make sure, you can run a performance test with your
worst-case scenario and see if all goes well. I'd also recommend monitoring
your cluster during the test, to see what limits you're approaching. There
are quite a lot of nice tools out there for monitoring ES, one of which is
ours:
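As a starting point for such a worst-case test, here's a small sketch that generates a deliberately wide mapping body you could PUT to a throwaway test index before indexing sample data. The field names (`field_0000`, ...), the type name `logs`, and the 10,000-field count are all just assumptions for illustration:

```python
import json

def build_wide_mapping(num_fields):
    """Build a mapping body with num_fields explicit string fields.

    The field names are made up; the point is only to produce a
    worst-case-width mapping for a load test.
    """
    properties = {
        "field_%04d" % i: {"type": "string", "index": "not_analyzed"}
        for i in range(num_fields)
    }
    return {"mappings": {"logs": {"properties": properties}}}

mapping = build_wide_mapping(10000)
# Serialize to see how large the request body gets.
print(len(json.dumps(mapping)))
```

From there you'd index a representative volume of documents against that mapping and watch heap usage and query latency as you scale the field count up.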