From previous posts I see there might be some help in 0.20 for facet memory
usage. However, I'm hoping there is something I can do myself until then,
besides adding memory/machines. I'm using elasticsearch as a DB, so I have
complete control over my data. I only use facets on a single field (tags),
which is an array of strings.
Would any of the following help?
a) I could keep an external lookup table and index an array of ints instead
of an array of strings. (This would have the side effect of making documents
smaller too, I guess.) See the first sketch below.
b) Some of the tags are in almost every single document; would it be better
to pull those out into separate indexed fields instead of including them in
the facet? (second sketch below)
c) Reduce the number of replicas (does each replica load the field data into
memory?). See the third sketch below.
d) Use more or fewer shards (right now 60 across 10 machines). (Also covered
in the third sketch.)
e) anything else?
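
To make (a) concrete, here is roughly what I mean. This is just a sketch;
the index name (myindex), type (doc), and the field names are made up for
illustration:

# Hypothetical mapping: index tags as ints (resolved via an external lookup table)
curl -XPUT 'http://localhost:9200/myindex/doc/_mapping' -d '{
  "doc": {
    "properties": {
      "tags": { "type": "integer" }
    }
  }
}'

# Same terms facet as today, just over the int field
curl -XPOST 'http://localhost:9200/myindex/_search' -d '{
  "query": { "match_all": {} },
  "facets": {
    "tags": { "terms": { "field": "tags", "size": 50 } }
  }
}'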
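
For (b), one way I imagine it working: a nearly-universal tag gets its own
boolean field (has_common_tag is a made-up name), and its count comes from a
filter facet rather than from the big terms facet:

# Hypothetical: count docs carrying the common tag via a filter facet,
# so that tag no longer needs to appear in the tags array at all
curl -XPOST 'http://localhost:9200/myindex/_search' -d '{
  "query": { "match_all": {} },
  "facets": {
    "has_common_tag": {
      "filter": { "term": { "has_common_tag": true } }
    }
  }
}'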
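
For (c) and (d), these are the knobs I'm thinking of (index names again made
up). As far as I know replicas can be changed on a live index, but shard
count is fixed at index creation, so (d) would mean reindexing:

# (c) drop to 1 replica on the live index
curl -XPUT 'http://localhost:9200/myindex/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'

# (d) shard count can only be set when an index is created,
# so changing it means creating a new index and reindexing
curl -XPUT 'http://localhost:9200/myindex_v2' -d '{
  "settings": { "number_of_shards": 30, "number_of_replicas": 1 }
}'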