Is there a significant difference in disk usage between keeping some data in an array and keeping it in individual fields? The data won't be queried often, but it may sometimes be used as a filter. So I'm not concerned about search speed in this case, only about whether I can save a couple of GB per X million documents.
Yes. Currently, I have about 5 values which aren't that important (e.g., I don't use scoring on them). With around 15 billion documents, would I save disk space if those values were in a single array field (like the tags example)?
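To make the comparison concrete, here is a rough sketch of the two layouts I mean. The field names (`attr_1` ... `attr_5`, `tags`) and the values are made up for illustration. It only compares the size of the raw JSON source of one document, which is an assumption-laden proxy: what actually ends up on disk also depends on stored-field compression, the inverted index, and doc values, so I don't expect the numbers to carry over directly.

```python
import json

# Hypothetical values, only for illustration.
values = ["red", "large", "cotton", "sale", "eu"]

# Layout A: one individual field per value (field names are made up).
doc_fields = {f"attr_{i}": v for i, v in enumerate(values, start=1)}

# Layout B: all values in a single array field, like a "tags" field.
doc_array = {"tags": values}


def source_bytes(doc):
    """Size in bytes of the compact JSON form of a document."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


size_fields = source_bytes(doc_fields)
size_array = source_bytes(doc_array)

# Difference in raw JSON bytes for one document (before any compression).
saved_per_doc = size_fields - size_array

print("individual fields:", size_fields, "bytes")
print("single array:     ", size_array, "bytes")
print("difference:       ", saved_per_doc, "bytes per document (raw JSON only)")
```

The difference here comes entirely from the repeated field names in the source, so I'd treat it as an upper-bound guess for the source part only, not as a measurement of index size.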