Adding to Benjamin's response inline:
On Mon, Sep 12, 2011 at 11:52 AM, Per Steffensen email@example.com:
Shay Banon wrote:
There is no support for unique constraints (and there probably won't be, because
of both the limitations of a distributed system and non-real-time search). On the
other hand, you can't have several docs with the same _id, and you can
actually index a document with a "create" op_type, which will cause the
indexing to fail if there is already a document indexed under the same _id.
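For illustration, here is a minimal sketch of how such a request could be addressed over the REST API. It only builds the method and URL and sends nothing. The host, the index name "twitter", the type "tweet" and the helper name create_request are assumptions made for the example; only the op_type=create parameter comes from the discussion above.

```python
from urllib.parse import quote, urlencode


def create_request(base_url, index, doc_type, doc_id):
    """Return (method, url) for an index request that must not overwrite.

    Hypothetical helper: it only formats the request line, it does not
    talk to a cluster.
    """
    # Escape each path segment so ids containing "/" or spaces stay intact.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    # op_type=create asks for a failure if the _id is already taken.
    query = urlencode({"op_type": "create"})
    return "PUT", f"{base_url.rstrip('/')}/{path}?{query}"


# Assumed local node and example names.
method, url = create_request("http://localhost:9200", "twitter", "tweet", "1")
print(method, url)
```

The document body would be sent as JSON with that request; handling the failure response is left out of the sketch.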
Thanks, Shay. Then in my world there IS unique constraint support - but
only on the _id field (no user-defined unique constraints). Can you say something
about the "scope" of that unique constraint on _id - is it per index or only
per shard in the index? If it is per index, I guess the feature will
actually be a scalability limit, not allowing ES to scale "to infinity" (but
probably very, very far) with respect to "number of nodes involved in serving
a specific index". But maybe not; can you say a little more about how it is
implemented, with respect to the communication needed among nodes running
shards in the index, in order to maintain the unique constraint on _ids in
the entire index?
A document's unique id is the tuple of its type and id. Since a document can't
exist in two shards at the same time, the scope is index-wide but the
check is shard-wise.
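To make that concrete, here is a toy model of why a shard-local check is enough: if the same id always routes to the same shard, only that one shard has to be consulted and no cross-node communication is needed. The CRC32 hash, the class name ToyIndex and the choice to route on the id alone are assumptions for the sketch, not a description of what ES actually does internally.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Toy routing: a stable hash of the id modulo the shard count.

    CRC32 is a stand-in hash chosen because it is deterministic.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


class ToyIndex:
    """In-memory model of an index split into shards."""

    def __init__(self, num_shards):
        self.shards = [set() for _ in range(num_shards)]

    def _shard(self, doc_id):
        return self.shards[shard_for(doc_id, len(self.shards))]

    def exists(self, doc_type, doc_id):
        # Only the single shard the id routes to is inspected.
        return (doc_type, doc_id) in self._shard(doc_id)

    def add(self, doc_type, doc_id):
        # Identity is the (type, id) tuple, as described above.
        self._shard(doc_id).add((doc_type, doc_id))
```

Because the routing is deterministic, the uniqueness guarantee holds for the whole index even though each check touches one shard.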
You say that I need to add a "create" op_type in order to make the index
operation fail if it violates the unique constraint on _id. I would expect
the index operation to fail anyway - what other possible outcome is there
when an index operation violates the _id unique constraint? What happens if I
try to index a new document with an _id that is already used by an existing
document in the index, and I do not add the "create" op_type thing that you
Updating the document.
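The difference between the two behaviours can be sketched with a small in-memory model. ToyStore and DocumentExists are hypothetical names, and the model only mirrors the semantics described in this thread (a plain index replaces the document, "create" refuses to); it is not ES code.

```python
class DocumentExists(Exception):
    """Raised by the toy store when a create hits an existing id."""


class ToyStore:
    def __init__(self):
        self.docs = {}

    def index(self, doc_type, doc_id, source, op_type="index"):
        key = (doc_type, doc_id)
        # With op_type "create" an existing id is an error.
        if op_type == "create" and key in self.docs:
            raise DocumentExists(f"{doc_type}/{doc_id} already exists")
        # Otherwise the new source replaces whatever was stored.
        self.docs[key] = source


store = ToyStore()
store.index("tweet", "1", {"user": "kimchy"})
# Same id without "create": the stored document is updated.
store.index("tweet", "1", {"user": "kimchy", "message": "second"})
try:
    store.index("tweet", "1", {"user": "other"}, op_type="create")
except DocumentExists as exc:
    print("rejected:", exc)
```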
On Mon, Sep 12, 2011 at 10:39 AM, Per Steffensen firstname.lastname@example.org:
Is there some way of enforcing unique constraints on documents in ES?
E.g. telling ES that at most one document is allowed in an index where
fields "key1" and "key2" have the same values. E.g. with the structure of data
in the example on
http://www.elasticsearch.org/guide/reference/api/index_.html, can I
somehow tell ES that if a document with "user=kimchy" and
"post_date=2009-11-15T14:12:12" has already been indexed into the index,
then no other documents with the exact same values for user and post_date
are allowed to be indexed?
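Given that uniqueness is only enforced on _id (see the reply above), one possible workaround is to derive the _id from the fields that must be unique together and then index with the "create" op_type. The sketch below shows only the id derivation. The helper name unique_key_id and the use of SHA-1 are assumptions for the example; nobody in this thread proposed this, so treat it as one option rather than the recommended approach.

```python
import hashlib


def unique_key_id(*values):
    """Derive a deterministic id from the values that must be unique together."""
    digest = hashlib.sha1()
    for value in values:
        encoded = str(value).encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(encoded).to_bytes(4, "big"))
        digest.update(encoded)
    return digest.hexdigest()


# Two documents with the same user and post_date get the same id, so a
# second "create" for that pair would be rejected.
doc_id = unique_key_id("kimchy", "2009-11-15T14:12:12")
print(doc_id)
```

The trade-off is that the application, not ES, is responsible for always computing the id the same way.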
It would be nice to have such a unique constraint feature across an index,
but to avoid communication overhead among nodes the feature might only work
on the same shard within the specific index (then it is up to the
application using ES to make sure the documents that might collide with
respect to the unique constraint will be routed to the same shard). Is there any
support for unique constraints - on index level or at least on shard level?
Is the "id" of documents at least under unique constraint limitations? Or
are you allowed to have more documents in the same index (or shard) with the
Regards, Per Steffensen