Thanks very much, Jörg, for your answer! I see the approach...
I understand that having integrity checks in a schema-less engine
like Elasticsearch isn't possible. However, would it be possible to have
checks at the field structure level before indexing is triggered in the ES
engine? Perhaps by specifying something like this in the mapping:
{
  "mappings": {
    "mydoc": {
      "properties": {
        (...)
        "name": {
          "type": "string", "store": "yes", "index": "analyzed",
          "checks": "not_null,not_empty,regexp=$"
        },
        (...)
      }
    }
  }
}
Thanks very much for your help!
Thierry
You can validate the data on the client side, in your model, before
serializing it to JSON, or after a complete bulk index run.
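A minimal sketch of the client-side option, assuming a Python client. The rule names mirror the hypothetical "checks" string from the mapping idea (not_null, not_empty, a regular expression). `RULES`, `validate_doc`, `to_json_if_valid`, and the pattern used for "name" are illustrative assumptions, not part of Elasticsearch.

```python
import json
import re

# Hypothetical per-field rules kept in the client; Elasticsearch never sees them.
RULES = {
    "name": {"not_null": True, "not_empty": True, "regexp": r"[A-Za-z][\w .-]*"},
}


def validate_doc(doc, rules=RULES):
    """Return a list of error strings; an empty list means the doc passed."""
    errors = []
    for field, checks in rules.items():
        value = doc.get(field)
        if value is None:
            if checks.get("not_null"):
                errors.append(f"{field}: must not be null")
            continue  # no further checks apply to a missing value
        if checks.get("not_empty") and isinstance(value, str) and not value.strip():
            errors.append(f"{field}: must not be empty")
            continue
        pattern = checks.get("regexp")
        if pattern and isinstance(value, str) and not re.fullmatch(pattern, value):
            errors.append(f"{field}: does not match {pattern}")
    return errors


def to_json_if_valid(doc):
    """Serialize only documents that pass validation; raise ValueError otherwise."""
    errors = validate_doc(doc)
    if errors:
        raise ValueError("; ".join(errors))
    return json.dumps(doc)
```

Only the string returned by `to_json_if_valid` would be handed to the index or bulk request, so a rejected document never reaches the cluster.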
There are reasons why Elasticsearch is schema-less. Being schema-less
means allowing any number of different fields (keys) and any content in
those fields (values), without any logical constraints.
In a distributed system, commits per field, or transactions per field,
or integrity checking can get very expensive. Because the index is
inverted, and nodes can come and go, there is a significant penalty if
you want document transaction safety and document integrity checks.
I validate data in ES with the help of a large scan/scroll over the
docs after bulk indexing, searching for IDs to see whether they exist.
This is different from integrity constraint checking techniques such as
the rule-based methods known from RDBMSs.
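One way such a post-bulk check could be sketched in Python. It assumes the scan/scroll results arrive as an iterable of hits shaped like {"_id": ...} (for instance from `elasticsearch.helpers.scan` in the Python client). `find_missing_ids`, the index name "myindex", and the sample hits are hypothetical and only illustrate the idea.

```python
def find_missing_ids(expected_ids, hits):
    """Compare the IDs sent in a bulk run with the IDs seen while scrolling.

    expected_ids: iterable of document IDs that were bulk indexed.
    hits: iterable of dicts carrying an "_id" key (assumed hit shape).
    Returns (missing, unexpected) as sorted lists.
    """
    expected = set(expected_ids)
    seen = set()
    for hit in hits:  # hits are consumed one at a time; only the IDs are kept
        seen.add(hit["_id"])
    missing = sorted(expected - seen)
    unexpected = sorted(seen - expected)
    return missing, unexpected


# Stand-in for a real scroll. With the Python client this could be something like
# elasticsearch.helpers.scan(client, index="myindex",
#                            query={"query": {"match_all": {}}})
fake_hits = [{"_id": "1"}, {"_id": "2"}, {"_id": "4"}]
missing, unexpected = find_missing_ids(["1", "2", "3"], iter(fake_hits))
print("missing:", missing, "unexpected:", unexpected)
```

Documents reported as missing can then be re-sent in a follow-up bulk request.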
Jörg
On Sat, Feb 15, 2014 at 10:40 PM, Thierry Templier <templth@gmail.com> wrote:
Hello,
I wonder if there is a built-in way to validate data before
indexing it. I see two kinds of validation:
* Structural validation of fields, based on a regular expression
for example. Perhaps something can be configured in the mapping...
* Integrity validation of documents, for example preventing the
indexing of a document with a field value that already exists.
If there is no built-in support at the moment, is there a way to
extend Elasticsearch to add such processing before indexing when
using the standard REST calls?
Thanks very much for your help!
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/52FFDEC9.7020007%40gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.