Hi,
I've been testing facet performance over the last week, and at first it seemed like the perfect solution for my needs. Now I'm not so sure.
I have 2 indexes, index1 with 50GB and index2 with 160GB (both use the mapping below).
Each user can have up to 1000 different products (distinct term values), and each term can reach a count of up to 4M.
index1 has 15M users and index2 has 65M users (main documents).
According to elasticsearch-head:
index1 has a total of docs: 189550849 (276425354)
index2 has a total of docs: 1452655206 (1840196363)
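To put those counts in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that the first number reported by elasticsearch-head is the live document count (root plus nested documents) and that the number in parentheses is the maximum document count including deleted documents that have not been merged away yet. That reading is my assumption, and the user counts are the rounded figures given above.

```python
# Rough scale estimate from the figures quoted above.
# Assumption: head reports "live docs (max docs incl. deleted)".
INDEXES = {
    "index1": {"users": 15_000_000, "docs": 189_550_849, "max_docs": 276_425_354},
    "index2": {"users": 65_000_000, "docs": 1_452_655_206, "max_docs": 1_840_196_363},
}


def scale_summary(stats):
    """Return (average Lucene docs per user, share of deleted docs)."""
    docs_per_user = stats["docs"] / stats["users"]
    deleted_share = 1 - stats["docs"] / stats["max_docs"]
    return round(docs_per_user, 1), round(deleted_share, 3)


for name, stats in INDEXES.items():
    per_user, deleted = scale_summary(stats)
    print(f"{name}: ~{per_user} docs per user, ~{deleted:.1%} deleted")
```

Under those assumptions the average user is far below the 1000-product ceiling, and a noticeable share of each index consists of deleted documents (possibly related to _ttl expiry, though I have not confirmed that).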
Running the query below on index1 takes ~5 seconds, which is not that fast to begin with, but that is the first run, before any caching. On the other hand, running it on index2 takes up to 30 seconds, which exceeds my SLA.
I also tried using warmers (see below), but I think that because the dates in facet_filter keep changing, the cache is not that helpful.
I'm using 8 m3.2xlarge nodes in my cluster (8 cores, 30GB RAM each). 20GB are allocated to ES.
I can tell from bigdesk that not all nodes are participating in the facet calculation, and that CPU usage on the nodes that do participate is not that high (up to 30%) but stays at that level for a relatively long time.
I think that my biggest challenge here is to find the right warmer queries for this task, but I will try anything that can make my queries go faster.
Is the index too big for what I'm trying to do?
Is the document/sub-document count too big for it?
Does the productCode field have too many distinct term values for the calculation to run in a reasonable time?
Update:
I forgot to mention that I'm using v0.90.2.
Thanks in advance,
query example:
{
  "size": 0,
  "facets": {
    "tags": {
      "terms": {
        "field": "productCode",
        "size": 1000,
        "regex": "PRODUCT\\d+"
      },
      "nested": "products",
      "facet_filter": {
        "range": {
          "products.time": {
            "from": "2013-12-01",
            "to": "2013-12-31",
            "include_lower": true,
            "include_upper": true
          }
        }
      }
    }
  }
}
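Since the changing dates seem to defeat caching, one idea I am considering is to snap the range to whole calendar months, so that repeated requests send exactly the same filter. Below is a minimal Python sketch of that idea. The helper names `month_bounds` and `build_facet_request` are made up for illustration, and I have not verified how much an identical filter actually helps on 0.90.2; it is only a precondition for any cache reuse.

```python
import calendar
import json
from datetime import date


def month_bounds(day):
    """Snap any date to the first and last day of its calendar month."""
    last = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=1), day.replace(day=last)


def build_facet_request(day, size=1000):
    """Build the facet request from the query example with a month-aligned range."""
    start, end = month_bounds(day)
    return {
        "size": 0,
        "facets": {
            "tags": {
                "terms": {
                    "field": "productCode",
                    "size": size,
                    # json.dumps escapes the backslash for us
                    "regex": r"PRODUCT\d+",
                },
                "nested": "products",
                "facet_filter": {
                    "range": {
                        "products.time": {
                            "from": start.isoformat(),
                            "to": end.isoformat(),
                            "include_lower": True,
                            "include_upper": True,
                        }
                    }
                },
            }
        },
    }


# Two different days in December 2013 produce identical request bodies.
body_a = json.dumps(build_facet_request(date(2013, 12, 5)), sort_keys=True)
body_b = json.dumps(build_facet_request(date(2013, 12, 20)), sort_keys=True)
print(body_a == body_b)
```

The trade-off is coarser date ranges: anything finer than a month would need a different bucketing scheme (weeks or days), which multiplies the number of distinct filters.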
mapping:
{
  "user" : {
    "_ttl" : {
      "enabled" : true
    },
    "properties" : {
      "products" : {
        "type" : "nested",
        "properties" : {
          "time" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "productCode" : { "type" : "string", "index": "not_analyzed" }
        }
      }
    }
  }
}
warmer:
{
  "query": { "match_all": {} },
  "size": 0,
  "facets": {
    "tags": {
      "terms": {
        "field": "productCode",
        "size": 1000
      },
      "nested": "products"
    }
  }
}
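The warmer above has no facet_filter, so it never exercises the date range. One variation I could try is registering one warmer per recent calendar month, each carrying a month-aligned range filter. The Python sketch below only generates the warmer bodies; the naming scheme `facet_YYYY_MM` and the helper names are hypothetical, and whether this actually speeds up the real queries is untested.

```python
import calendar
from datetime import date


def previous_months(today, count):
    """Yield (year, month) for the current month and the count-1 months before it."""
    year, month = today.year, today.month
    for _ in range(count):
        yield year, month
        month -= 1
        if month == 0:
            year, month = year - 1, 12


def warmer_bodies(today, count=3):
    """Map hypothetical warmer names to warmer bodies, one per calendar month."""
    bodies = {}
    for year, month in previous_months(today, count):
        last_day = calendar.monthrange(year, month)[1]
        name = f"facet_{year}_{month:02d}"  # hypothetical naming scheme
        bodies[name] = {
            "query": {"match_all": {}},
            "size": 0,
            "facets": {
                "tags": {
                    "terms": {"field": "productCode", "size": 1000},
                    "nested": "products",
                    "facet_filter": {
                        "range": {
                            "products.time": {
                                "from": date(year, month, 1).isoformat(),
                                "to": date(year, month, last_day).isoformat(),
                                "include_lower": True,
                                "include_upper": True,
                            }
                        }
                    },
                }
            },
        }
    return bodies


for name in sorted(warmer_bodies(date(2014, 1, 15))):
    print(name)
```

Each body could then be registered as a separate warmer on the index, and this only makes sense if the real queries use the same month-aligned ranges.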