I have an indexing question about EES App Search. By the way, we are on the cloud offering, v8.4.
We use the API to push documents into the search engine, and we send EES the whole document that we want our custom API to return. However, only a couple of the properties/fields of the indexed document are relevant to our search results.
I know there is an option to specify which fields EES should search, but that doesn't seem to yield the results we expect, and the scoring is really weird.
My questions:
Are all the fields being indexed by EES?
Is there a way to tell EES which fields to index and which not to index? I'm assuming that when it indexes all of them, the generated score might be incorrect, and that could affect the search results.
Hello @lnaie, welcome to the community!
If I understand the requirement correctly, you only want to store some fields from your logs in ES. The best approach would be to adjust the logging source (the API) that emits those message/audit fields and remove the unwanted fields there.
If that's not a viable option, you can update your index template and set "index": false for the fields you don't want to index.
For your questions:
By default, yes, all the fields are indexed by ES.
Use the "index": false mapping parameter for fields that should not be indexed or available for search but should still be stored.
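As a rough illustration of the "index": false suggestion above, here is a minimal Python sketch that builds a mappings body for plain Elasticsearch. The field names (title, description, internal_code), their types, and the build_mappings helper are hypothetical and only for illustration. Nothing in this thread confirms that App Search exposes this mapping parameter through its own schema.

```python
import json


def build_mappings(searchable, stored_only):
    """Build an Elasticsearch mappings body (illustrative helper, not an official API).

    searchable:  field names that stay indexed (the default behaviour)
    stored_only: field names mapped with "index": false
    """
    properties = {}
    for name in searchable:
        # Assumption: free-text fields; "text" fields are indexed by default.
        properties[name] = {"type": "text"}
    for name in stored_only:
        # Assumption: "keyword" type; "index": False disables the search index for the field.
        properties[name] = {"type": "keyword", "index": False}
    return {"mappings": {"properties": properties}}


# Hypothetical field names, chosen only for this example.
body = build_mappings(["title", "description"], ["internal_code"])
print(json.dumps(body, indent=2))
```

The printed JSON could serve as the body of a create-index request, or its mappings section could be placed in an index template as suggested above.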
There are two similar settings. Both are valid, but they accomplish slightly different things.
"index": false: This still parses the field and stores it in the fields structure, but the field is not searchable or aggregatable. It is retrievable as part of fields, which can be faster than _source, but it does take some storage.
"enabled": false: This neither parses the field nor stores it in the fields structure. The field is left alone and remains only in _source; it is not searchable or aggregatable, and it cannot be retrieved from fields.
I'm aware of the differences, but more ES features should be implemented in EES.
I know that removing the fields is the default solution, but it's not always the one that's needed. There should be a way to specify in the schema which fields should be indexed and which should not, or some variation on this idea.