I'm taking some time to review our mapping.
We've enabled doc_values on the fields that aggregations will be calculated on.
But I'm not sure whether there is any benefit to doing this for a boolean field.
This also leads me to wonder how Elasticsearch (or Lucene) indexes boolean
fields.
Boolean fields are simply indexed as strings: "T" for true and "F" for
false. Field data requires about 2 bits per document (one to know
if the document has a value, and one to store the value). That's not much,
but it can still use quite a lot of memory if you have lots of documents.
If/when doc values support comes to booleans, this will help move most of
this memory usage to disk and the filesystem cache.
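
To put the "about 2 bits per document" figure in perspective, here is a rough back-of-the-envelope sketch in Python. The function name and the document counts are made up for illustration, and the calculation assumes exactly 2 bits per document with no other overhead, so treat the results as a lower-bound estimate rather than what Elasticsearch actually reports.

```python
def boolean_fielddata_bytes(num_docs: int, bits_per_doc: int = 2) -> int:
    """Estimate field data size for one boolean field, in bytes.

    Assumes bits_per_doc bits per document (one "has a value" bit plus
    one value bit) and ignores any per-segment or JVM overhead.
    """
    total_bits = num_docs * bits_per_doc
    # Round up to whole bytes.
    return (total_bits + 7) // 8


if __name__ == "__main__":
    # Hypothetical index sizes, chosen only for illustration.
    for num_docs in (1_000_000, 100_000_000, 1_000_000_000):
        size = boolean_fielddata_bytes(num_docs)
        print(f"{num_docs:>13,} docs: {size / (1024 * 1024):8.2f} MiB")
```

Under these assumptions, a billion documents would need roughly 238 MiB of heap per boolean field, which is where moving the data to doc values would start to matter.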