Source field exclusion seems to be storing data anyway


(Randy McCluer) #1

I have some sensitive data that I want excluded from source, but indexed. I
am using "_source": { "excludes": ["field1"] }, and everything seems to be
working just as expected with the source docs coming back without field1.
If I update the mapping to not exclude field1, the docs still return
without field1 as well. However, if I restart the service, they start
coming back with field1 in the doc, indicating that the data was being
stored all along.
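
To make the setup concrete, here is a minimal Python sketch of the kind of mapping body described above, plus a toy model of what the documentation says should happen to `_source` at write time. The type name `doc` and the helpers `build_mapping` and `filter_source` are hypothetical names for illustration only. The type-keyed mapping layout is an assumption based on the 1.x-era API, and the filter only handles top-level field names. It is not Elasticsearch's actual implementation.

```python
import json
from fnmatch import fnmatchcase


def build_mapping(type_name, excludes):
    """Build an index-creation body that excludes fields from _source.

    Assumes the type-keyed mapping layout; `type_name` is a placeholder.
    """
    return {"mappings": {type_name: {"_source": {"excludes": list(excludes)}}}}


def filter_source(doc, excludes):
    """Toy model of the documented write-time behavior.

    Drops every top-level field whose name matches an exclude pattern.
    The original document is left untouched, since the excluded field
    still has to be available for indexing.
    """
    return {
        key: value
        for key, value in doc.items()
        if not any(fnmatchcase(key, pattern) for pattern in excludes)
    }


mapping = build_mapping("doc", ["field1"])
doc = {"field1": "sensitive", "field2": "public"}
stored = filter_source(doc, ["field1"])

print(json.dumps(mapping, indent=2))
print(stored)
```

If exclusion really happens at write time, only the filtered dictionary should ever reach disk, so the excluded field should not be able to reappear after a restart.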

All of the documentation I've found indicates that the excluded fields are
removed at write-time. My situation leads me to believe that this isn't the
case. Can someone tell me if this is the expected behavior or a bug to be
filed? I'd really hate to have to go down the Solr-style route of declaring
all of my fields individually.

TIA

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/dd420599-3a17-43a5-b5f1-ab552e910a94%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Randy McCluer) #2

I posted this over the holidays (Monday, January 5, 2015 at 11:30:36 AM UTC-6), so I figured I'd bump it. Has anyone else ever
seen this behavior?



(Benjamin Gathmann) #3

Hi, I have just opened a similar question on this topic: "Exclude complete path from indexing and _source". From all the posts I have read, Elasticsearch does not seem to behave very reliably with respect to excluding fields or paths from _source and being indexed.
I am curious to see answers to this.

