Hints for reducing facet memory?


(Andy Wick) #1

From previous posts I see there might be some help in 0.20 for facet memory
usage. However, I'm hoping there is something I can do myself until then,
besides adding memory/machines. I'm using elasticsearch as a DB, so I have
complete control over my data. I only use facets on a single field (tags)
that is an array of strings.

    "tags": {
      "type": "string",
      "index": "not_analyzed"
    }

Would any of the following help?
a) I could use an external lookup table and store an array of ints instead
of an array of strings. (This would have the side effect of making documents
smaller too, I guess.)
b) Some of the tags appear in almost every single document; would it be
better to switch those out to separate fields that are indexed, instead of
using facets?
c) Reduce the number of replicas (is each replica loading things into memory?)
d) Use more or less shards (right now 60 across 10 machines)
e) Anything else? :)
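For option (a), the external lookup table I have in mind would be something like this (a Python sketch; the function and variable names are made up for illustration):

```python
# Hypothetical sketch of option (a): keep tag strings in an external
# lookup table and store only small ints in elasticsearch.
tag_ids = {}    # tag string -> int id
tag_names = []  # int id -> tag string

def tag_to_id(tag):
    """Return a stable int id for a tag, assigning a new one if unseen."""
    if tag not in tag_ids:
        tag_ids[tag] = len(tag_names)
        tag_names.append(tag)
    return tag_ids[tag]

def encode_tags(tags):
    """Convert a document's tag array to the int form stored in the index."""
    return [tag_to_id(t) for t in tags]

def decode_tags(ids):
    """Map facet results (int ids) back to display strings."""
    return [tag_names[i] for i in ids]
```

So at index time I'd run each document's tags through `encode_tags`, and at query time map the facet entries back with `decode_tags`.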

Thanks,
Andy


(Shay Banon) #2

Moving to ints for tags instead of the full string representation will
help a lot memory-wise, and reducing the number of shards will also help.
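For the int-based approach, the mapping change might look something like this (a sketch; the `integer` type choice is an assumption, with the tag strings kept in an external lookup table):

```json
"tags": {
  "type": "integer"
}
```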

On Fri, Apr 20, 2012 at 7:17 PM, Andy Wick andywick@gmail.com wrote:


