Given the above, is the advice under the "Avoid sparsity" heading of the ES General Recommendations still relevant? It is still in the documentation for ES >= 6.0, and says things like:
In practice, this means that if an index has M documents,
norms will require M bytes of storage per field, even for fields
that only appear in a small fraction of the documents of the index.
Although slightly more complex with doc values due to the fact that
doc values have multiple ways that they can be encoded depending
on the type of field and on the actual data that the field stores,
the problem is very similar.
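To make the quoted claim concrete, here is a minimal sketch of the arithmetic it describes. It only restates the "M bytes per field" statement from the quoted docs. The function names and the example numbers are made up for illustration, and the second function is just an idealized lower bound (one byte per document that actually has the field, ignoring any index overhead), not a description of how any particular ES or Lucene version stores norms.

```python
def dense_norms_bytes(num_docs: int, num_fields: int = 1) -> int:
    """Storage per the quoted docs: one byte per document per field,
    whether or not the document actually contains the field."""
    return num_docs * num_fields


def present_only_bytes(docs_with_field: int) -> int:
    """Idealized lower bound (an assumption for comparison only):
    one byte for each document that really has the field."""
    return docs_with_field


if __name__ == "__main__":
    total_docs = 10_000_000   # hypothetical index size (M)
    docs_with_field = 100_000  # hypothetical: field appears in 1% of documents

    dense = dense_norms_bytes(total_docs)
    ideal = present_only_bytes(docs_with_field)
    print(f"dense encoding:   {dense} bytes for one sparse field")
    print(f"present-only:     {ideal} bytes")
    print(f"overhead factor:  {dense // ideal}x")
```

Under the behaviour the quote describes, a field present in 1% of documents costs as much norms storage as a field present in all of them, which is the cost this thread is asking about.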
Thanks for that. So you would confirm that the specific explanation of why sparsity is bad (the one I quoted above from the 6.0 docs) is now obsolete, right? If so, where can I file a request to have it removed?
Submitted https://github.com/elastic/elasticsearch/issues/30833 (filed against elastic/elasticsearch, not elastic/docs, because the latter says: "If you find an error in the documentation, you should open an issue or pull request on the repository which contains the docs. For instance, the elasticsearch docs can be found in the main elasticsearch repository.")