I want to implement hierarchical facets for our categories data.
Categories was stored with nested set model, so every category has a lft and rgt field. I can index these two fields into
elasticsearch, and use range facet to get count.
Recently I found another way to do the same functionality. Use Path
Hierarchy Tokenizer which elasticsearch exposed to index category path
which like /20/134/7856, and use term facet with regex patterns to get
count.
I'm new to elasticsearch. I have known facets need lots of memory,
term facet will load relevant field values into memory, but still
don't know which one is better.
I want to implement hierarchical facets for our categories data.
Categories was stored with nested set model, so every category has a lft and rgt field. I can index these two fields into
elasticsearch, and use range facet to get count.
Recently I found another way to do the same functionality. Use Path
Hierarchy Tokenizer which elasticsearch exposed to index category path
which like /20/134/7856, and use term facet with regex patterns to get
count.
I'm new to elasticsearch. I have known facets need lots of memory,
term facet will load relevant field values into memory, but still
don't know which one is better.
I want to implement hierarchical facets for our categories data.
Categories was stored with nested set model, so every category has a lft and rgt field. I can index these two fields into
elasticsearch, and use range facet to get count.
Recently I found another way to do the same functionality. Use Path
Hierarchy Tokenizer which elasticsearch exposed to index category path
which like /20/134/7856, and use term facet with regex patterns to get
count.
I'm new to elasticsearch. I have known facets need lots of memory,
term facet will load relevant field values into memory, but still
don't know which one is better.
I'm unclear as to exactly how you're using the terms/range facets, but
facets need to load all the field values for every doc into memory.
The memory usage depends on:
the number of unique values
the number_of_docs * max_values_per_doc
The path hierarchy tokenizer results in multiple values per field,
eg /foo/bar/baz gives you:
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.