I have some data that we want to index only for filtering, without storing it in the _source field. We use _source exclusion for those fields.
In certain cases, a document doesn't contain that field even in the index, and hence those documents are not returned by the filter.
It doesn't happen for all documents, but occasionally a few documents end up without that field in the index.
We enabled client-side logs. We index documents using bulk requests with upsert, and the logs show that the request had the field, but Elasticsearch still didn't have that field in the index.
We enabled server-side logs as well, but we don't see any error or silent failure.
Can you please share if there is anything we are missing?
The _source field must be enabled to use update. In addition to _source, you can access the following variables through the ctx map: _index, _type, _id, _version, _routing, and _now (the current timestamp).
Updates where source is disabled are therefore probably expected to result in missing fields.
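To illustrate why an update depends on _source, here is a toy model in Python. It is only a sketch of the general get, merge, reindex flow that an update performs, not Elasticsearch's actual code, and the class `ToyIndex` and the field name `filter_only` are made up for the example. It assumes that an update on an existing document rebuilds the document from the stored _source plus the partial doc in the request, so any excluded field that is not in that particular request is lost on reindex. Whether this is what happens in your case is an assumption I cannot verify.

```python
class ToyIndex:
    """Toy model of an index with _source excludes (not real Elasticsearch code)."""

    def __init__(self, excluded):
        self.excluded = set(excluded)  # fields stripped from the stored _source
        self.docs = {}                 # doc_id -> {"indexed": ..., "source": ...}

    def index(self, doc_id, full_doc):
        # Every field is indexed, but excluded fields are not kept in _source.
        self.docs[doc_id] = {
            "indexed": dict(full_doc),
            "source": {k: v for k, v in full_doc.items() if k not in self.excluded},
        }

    def update(self, doc_id, partial, upsert):
        existing = self.docs.get(doc_id)
        if existing is None:
            # Document does not exist yet: the upsert document is indexed as-is.
            self.index(doc_id, upsert)
            return
        # Document exists: merge the partial doc into the STORED _source,
        # which no longer contains the excluded fields, then reindex.
        merged = {**existing["source"], **partial}
        self.index(doc_id, merged)

    def searchable(self, doc_id, field):
        return field in self.docs[doc_id]["indexed"]


idx = ToyIndex(excluded={"filter_only"})
full = {"title": "a", "filter_only": "2024-01-01"}

idx.update("1", full, upsert=full)                 # first write goes through upsert
first = idx.searchable("1", "filter_only")

idx.update("1", {"title": "b"}, upsert=full)       # partial update without the field
second = idx.searchable("1", "filter_only")

idx.update("1", {"title": "b", "filter_only": "2024-01-02"}, upsert=full)  # resend
third = idx.searchable("1", "filter_only")

print(first, second, third)
```

In this model the field is only searchable when the most recent update request itself carried it, which could look intermittent if several writers or partial updates touch the same document.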
That's why I am confused. The first time, the field doesn't end up in the index, but when we resend the same document using the same Bulk request, it does get indexed. How is that possible then?
We use the UpdateRequest upsert method in a Bulk Request, which sometimes keeps the field in the index and sometimes doesn't. The reason for this intermittency is not clear.
And it doesn't happen with all the documents. The field is dropped from only some documents, and a resend fixes it.
That I do not know. I would expect it to fail consistently. Is it possible that the existing document sometimes has the same value you are updating to, so it just looks like it sometimes succeeded?
As the documentation mentions that this is not supported, I would recommend against doing this (or enabling source).
The existing document never has the field with the same value, and we validated that with a date field. So it's always a changed value, which should have been indexed.
Yeah, I understand that in this scenario we will have to enable it. But I am not able to wrap my head around the intermittency.
If you can create a minimal example that can be run from the Kibana dev console or curl and that reliably reproduces the issue, someone can have a look at it, but without that it is difficult.
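As a starting point for such a reproduction, a small Python helper like the one below could generate the request body to paste after `POST _bulk` in the dev console or to send with curl. This is only a sketch: the helper name `bulk_update_body`, the index name `test-idx`, and the field names are hypothetical, and it assumes your client sends the document with `doc_as_upsert` rather than a separate upsert document, which may not match what your UpdateRequest actually produces.

```python
import json


def bulk_update_body(index, doc_id, doc):
    """Build one update action for the _bulk API as newline-delimited JSON."""
    action = {"update": {"_index": index, "_id": doc_id}}
    payload = {"doc": doc, "doc_as_upsert": True}
    # Each line is one JSON object, and the body must end with a newline.
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"


# Hypothetical document: "filter_only" stands in for the excluded field.
body = bulk_update_body(
    "test-idx", "1", {"title": "a", "filter_only": "2024-01-01"}
)
print(body, end="")
```

Sending the same body several times, interleaved with updates that leave out the excluded field, and running the filter query after each step should show whether the behaviour can be reproduced reliably.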