I got quite confused today while testing (with the Java client) what is
actually stored in the index for a field. Before investing time in a REST gist,
I just wanted to check whether there is a major misunderstanding on my side:
I have a type mapping with _source = disabled and dynamic = false.
My field has index = analyzed and store = no.
I index some content into the field.
I run a get request for the indexed document and configure the get
request to return the field.
I checked that the mapping I defined is really in effect on the index (so
no basic misconfiguration). I expected the field to be null in the get
response (as it is not stored). If not null, I expected to get at most the
analyzed version (e.g. lowercased, which is what my custom test analyzer does).
However, what I always get is the real (non-analyzed) value. Is
this correct behavior?
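
For reference, the mapping described above would look roughly like the sketch below. It only builds the JSON as a plain Java string so it runs without any client library. The type name "person", the field name "ssn", and the analyzer name "my_lowercase" are placeholders I made up for illustration, and the option names assume the legacy string-field mapping syntax (index = analyzed, store = no) that matches the settings listed above.

```java
public class MappingSketch {

    // Builds the mapping JSON for a hypothetical type "person" with one
    // hypothetical field "ssn". All names here are illustrative assumptions.
    static String mappingJson() {
        return "{\n"
            + "  \"person\": {\n"
            + "    \"_source\": { \"enabled\": false },\n"   // _source = disabled
            + "    \"dynamic\": false,\n"                     // dynamic = false
            + "    \"properties\": {\n"
            + "      \"ssn\": {\n"
            + "        \"type\": \"string\",\n"
            + "        \"index\": \"analyzed\",\n"            // index = analyzed
            + "        \"store\": \"no\",\n"                  // store = no
            + "        \"analyzer\": \"my_lowercase\"\n"      // assumed custom analyzer name
            + "      }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(mappingJson());
    }
}
```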
Awesome! Thanks for getting back so quickly. The background to this
question / test is that I plan to implement a special 'hashing' analyzer
for security-relevant fields (e.g. social security numbers). I
am trying to provide some protection for such fields (in case the index gets
compromised or stolen) while still offering some (very) basic search
capabilities on those fields. Obviously, for this to work it must not be
possible to access the original field value (which is why I was so surprised
by my test case).
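
To make the idea concrete, here is a minimal sketch of the hashing step on its own, outside of any analyzer plumbing. It assumes a keyed hash (HMAC-SHA256 from the JDK) rather than a plain digest, since values like social security numbers come from a small space and an unkeyed hash could be brute-forced. The class name, the helper hashToken, the normalization rule, and the demo key are all hypothetical. The same function would have to be applied at index time and at query time, so only exact-match search is possible on such a field.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class HashedFieldSketch {

    // Hypothetical helper: strips separators, lowercases, then returns the
    // HMAC-SHA256 of the normalized value as a lowercase hex string.
    static String hashToken(String rawValue, byte[] secretKey) throws Exception {
        String normalized = rawValue.replaceAll("[^0-9A-Za-z]", "").toLowerCase(Locale.ROOT);
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secretKey, "HmacSHA256"));
        byte[] digest = mac.doFinal(normalized.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Demo key only; a real key would have to be kept outside the index.
        byte[] key = "demo-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        String indexed = hashToken("123-45-6789", key);  // value hashed at index time
        String queried = hashToken("123 45 6789", key);  // value hashed at query time
        System.out.println(indexed.equals(queried));
    }
}
```

With this approach the index would only ever see the hex token, so the original value could not be recovered from the terms even if the field were stored.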