I'm indexing some documents. Each document can have a field or two that I
do not want in the response, but I still want to be able to search on them
(they should be indexed, not stored).
What is the best way to do that?
What I understand is that I could:
  - set _source.enabled to false
  - set store to true on all the "public" fields
  - query using fields[*]
Another solution is to query and specify the fields I want explicitly:
  - query using a long list of fields[title, stuff, lot, of, stuff]
Yet another could be the new (kind of?) "excludes" and "includes" _source
options:
  - set _source.excludes to [private_data]
I didn't find much documentation about the "excludes" / "includes"
options, but they seem to be by far the best option for me.
What about performance, though? And sorting?
What are the downsides of "excludes" compared with a fields[] query?
Am I doing this right?
The answer depends on this question: do you want to be able to reindex
your documents using Elasticsearch as the source? In other words, if
you need to change your index (e.g., you need to make some incompatible
change to your mapping), would you create the new index by pulling data
from the old index, or would you pull your data from some external data
source?
If you would pull from an external data source, then map your _source
field to use 'excludes'. The excluded fields are still indexed and
searchable, but they are no longer kept in the stored _source.
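As a rough sketch of the 'excludes' mapping, the snippet below builds an index-creation body in Python and prints it as JSON. The field name private_data comes from the question; the helper name build_index_body is made up for illustration. The body is written without a mapping type name, which is an assumption: older Elasticsearch releases nest the _source block under a type name, so check the layout against the version you run.

```python
import json

# Field name taken from the question; replace with your own private fields.
PRIVATE_FIELDS = ["private_data"]


def build_index_body(excluded_fields):
    """Return an index-creation body whose mapping drops the given
    fields from the stored _source (hypothetical helper).

    Assumption: typeless mapping layout. On older releases the
    "_source" block sits under a mapping type name instead.
    """
    return {
        "mappings": {
            "_source": {"excludes": list(excluded_fields)}
        }
    }


index_body = build_index_body(PRIVATE_FIELDS)
print(json.dumps(index_body, indent=2))
```

The printed JSON would be sent as the body of the index-creation request. Since the excluded fields never reach the stored _source, they cannot be recovered from Elasticsearch later, which is why this option only fits when reindexing from an external source.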
If you want to use Elasticsearch as the data source, then keep your
_source field intact, and either:
  - specify the fields you want returned with the fields parameter, or
  - use partial fields to strip out the parts of the source that you
    don't want returned
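The two query-time options above might look roughly like this as search request bodies, built here in Python and printed as JSON. The helper names, the "public" label for the partial field, and the match query on private_data are all made up for illustration. The request keys "fields" and "partial_fields" follow the search API of the Elasticsearch releases current when this thread was written; this is an assumption about your version, since later releases changed these parameters and offer _source filtering in the search request instead.

```python
import json

# Illustrative query: search on the private field without returning it.
QUERY = {"match": {"private_data": "secret"}}


def search_with_fields(query, fields):
    """Option 1: list explicitly the fields to return (hypothetical helper)."""
    return {"query": query, "fields": list(fields)}


def search_with_partial_fields(query, name, exclude):
    """Option 2: return the source minus the excluded paths.

    Assumption: "partial_fields" with an "exclude" pattern list, as in
    older Elasticsearch releases; `name` is an arbitrary label.
    """
    return {
        "query": query,
        "partial_fields": {name: {"exclude": list(exclude)}},
    }


by_fields = search_with_fields(QUERY, ["title", "stuff"])
by_partial = search_with_partial_fields(QUERY, "public", ["private_data"])

print(json.dumps(by_fields, indent=2))
print(json.dumps(by_partial, indent=2))
```

In both cases the full _source stays in the index, so the data can still be pulled back out for a reindex; only the response is trimmed.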