Dynamic schema from NoSQL river

We're exploring using a NoSQL system (CouchDB, Couchbase, etc.) as our
primary data store and ES with a river plugin for search in a multitenant
system where each user can define their own set of custom fields. NoSQL
does well with this, but I'm concerned about ES indexing these custom
fields.

We might have user_1 create a custom field called "field1" that stores a
date data type, and user_2 create a field with the same name that stores an
integer data type.

Would it make sense to force the data type to be part of the field name?
Like date_field1, etc? It seems like ES might have an easier time with
this.
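To make the naming idea concrete, here is a minimal sketch of how the type prefix could be applied before documents reach ES. The helper names (`typed_field_name`, `prefix_custom_fields`) and the prefix table are hypothetical, not part of any ES or river API; only the `date_field1` form comes from the question above.

```python
from typing import Any, Dict

# Hypothetical table: user-declared data type -> field-name prefix.
TYPE_PREFIXES = {"date": "date", "integer": "int", "string": "str"}


def typed_field_name(field_name: str, data_type: str) -> str:
    """Return the field name with its data type prepended, e.g. date_field1."""
    if data_type not in TYPE_PREFIXES:
        raise ValueError(f"unsupported data type: {data_type!r}")
    return f"{TYPE_PREFIXES[data_type]}_{field_name}"


def prefix_custom_fields(
    fields: Dict[str, Any], declared_types: Dict[str, str]
) -> Dict[str, Any]:
    """Rename every custom field according to the type its owner declared."""
    return {
        typed_field_name(name, declared_types[name]): value
        for name, value in fields.items()
    }


# Two users define "field1" with different data types.
user_1_doc = prefix_custom_fields({"field1": "2013-02-01"}, {"field1": "date"})
user_2_doc = prefix_custom_fields({"field1": 42}, {"field1": "integer"})

print(user_1_doc)  # {'date_field1': '2013-02-01'}
print(user_2_doc)  # {'int_field1': 42}
```

With this convention the two users' fields no longer share a name, so a given field name always carries a single data type.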

As a second concern: will ES have issues with the number of unique field names?
Over time the index could have thousands of unique field names if we give
our users the option of defining their own field names.

Creating a separate index per user will not work considering the scale
we're after.

Nick

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Instead of creating an index per user, I would think about creating a type per user.
Each type has its own mapping so you can mix things.

You can also use templates to apply a common mapping pattern to each new user's type, and let each user add their own new fields.
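As a rough illustration, and assuming "templates" here means the `dynamic_templates` section of a type mapping, a per-user mapping body might be built like this. The helper `build_user_type_mapping`, the common field names, and the `date_*` / `int_*` naming patterns are assumptions for the sketch. The layout (mapping keyed by type name, `string` field type) follows ES releases that have mapping types; later releases removed them.

```python
import json
from typing import Any, Dict


def build_user_type_mapping(type_name: str) -> Dict[str, Any]:
    """Build a mapping body for one per-user type (hypothetical helper)."""
    return {
        type_name: {
            # Fields every user shares; the names are illustrative only.
            "properties": {
                "created_at": {"type": "date"},
                "title": {"type": "string"},
            },
            # Custom fields get their mapping from the field-name pattern
            # the first time they appear in a document.
            "dynamic_templates": [
                {"dates": {"match": "date_*", "mapping": {"type": "date"}}},
                {"integers": {"match": "int_*", "mapping": {"type": "integer"}}},
            ],
        }
    }


mapping = build_user_type_mapping("user_1")
print(json.dumps(mapping, indent=2))
```

The same function can be called for each new user, so every type starts from the common pattern while still accepting new custom fields.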

Does that help?

(In reply to Nick Wood nwood888@gmail.com, 1 Feb 2013 at 05:40.)

Thank you David.

From a performance standpoint, it makes sense that we wouldn't want to
create an index per user. Will adding a type per user cause any issues if
there are, say, 1 million different types?

"Each type has its own mapping so you can mix things"

  • Are you sure this is the case? I remember reading somewhere that even
    between types, fields with the same name but different data types can
    cause problems. I can't for the life of me find the post that said this,
    so maybe I was dreaming :)

Two more somewhat related questions:

  1. Is there a way to specify the type without requiring that it be the
    "root" element in the JSON object? I ask because it would be cleaner and
    easier for us to put the type in a regular field if we can somehow define
    its location through the mapping.

  2. When you refer to using templates, are you talking about the
    dynamic_templates section of the Elasticsearch mapping documentation?
    Is there a better place to learn about templates? I read that section
    and it's still not obvious how to use them.
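On the first question, one client-side workaround can be sketched as follows. It does not answer whether a mapping can define where the type lives. It assumes the documents are indexed by application code rather than by the river, and that the type is kept in a regular field, here given the hypothetical name `doc_type`. The `/index/type/id` path is the REST layout of ES releases that have mapping types. The helper `index_request` is illustrative only.

```python
from typing import Any, Dict, Tuple
from urllib.parse import quote


def index_request(
    index: str, doc_id: str, doc: Dict[str, Any], type_field: str = "doc_type"
) -> Tuple[str, Dict[str, Any]]:
    """Derive the request path from a regular field; the body stays flat."""
    if type_field not in doc:
        raise KeyError(f"document has no {type_field!r} field")
    doc_type = str(doc[type_field])
    # Percent-encode each path segment so odd characters cannot break the URL.
    path = "/" + "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    return path, doc


path, body = index_request(
    "customers", "42", {"doc_type": "user_1", "date_field1": "2013-02-01"}
)
print(path)  # /customers/user_1/42
```

The document body is sent unchanged, so the type never has to wrap the JSON as a root element.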

Thanks again for your help!

Nick

(In reply to David Pilato david@pilato.fr, Fri, Feb 1, 2013 at 12:49 AM.)