How to improve facet performance?

https://groups.google.com/forum/#!topic/elasticsearch/3j_r5uc_7F4

Hello.
I had an OOM problem and resolved it by setting the option
'index.cache.field.type: soft'.
Now my query and facet work fine.

But I have run into a new problem: poor performance.

For example,

Req : {"size":0,"query":{"field":{"follows":"2324"}},"facets":{"plays":{"terms":{"size":10,"script":"doc['plays.musicId'].values"}}}}
Res : { took : 169612 ... }

For testing, I changed the facet field as follows.

Req : {"size":0,"query":{"field":{"follows":"2324"}},"facets":{"plays":{"terms":{"size":10,"script":"doc['plays.count'].values"}}}}
Res : {took : 5 ...}

This change improves performance dramatically.
There is one difference between 'musicId' and 'count':

  1. The count value is 1 for all docs (constant).
  2. The musicId value of each doc is between 1 and 50000 and each doc has
    300 musicIds.

Is there a way to improve performance?
Please give me some advice.
Thanks for reading.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hiya

Hello.

I had an OOM problem and resolved it by setting the option
'index.cache.field.type: soft'.
Now my query and facet work fine.

But I have run into a new problem: poor performance.

That's what soft gives you :) By using soft references, you haven't
solved the OOM problem, you've just forced ES to reload field data all the
time (which is very heavy).

You can set indices.fielddata.cache.size to avoid OOMs, but it's a
safety mechanism, not a solution. If you are running out of memory it'll
evict field data, which will affect performance.

You need more memory, or more nodes, or fewer facets. If you're faceting on
high cardinality string fields, that's going to use a lot of memory.
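
To make the eviction cost concrete, here is a minimal Python sketch of a size-bounded cache that evicts the least recently used entry. It is not Elasticsearch's implementation: the class name `BoundedFieldDataCache`, the `load` callback and the `count_reloads` helper are hypothetical, and the bound is counted in entries rather than bytes. It only illustrates that once the working set no longer fits, requests keep paying the reload cost.

```python
from collections import OrderedDict


class BoundedFieldDataCache:
    """Toy LRU cache standing in for a size-limited field data cache."""

    def __init__(self, max_entries, load):
        self.max_entries = max_entries
        self.load = load              # expensive loader, called on every miss
        self.entries = OrderedDict()
        self.reloads = 0

    def get(self, field):
        if field in self.entries:
            self.entries.move_to_end(field)   # mark as most recently used
            return self.entries[field]
        self.reloads += 1
        value = self.load(field)
        self.entries[field] = value
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict least recently used
        return value


def count_reloads(max_entries, fields, rounds):
    """Access every field once per round and report how often it was loaded."""
    cache = BoundedFieldDataCache(max_entries, load=lambda f: f.upper())
    for _ in range(rounds):
        for field in fields:
            cache.get(field)
    return cache.reloads


fields = ["plays.musicId", "plays.count", "follows"]
big_enough = count_reloads(3, fields, rounds=10)
too_small = count_reloads(2, fields, rounds=10)
```

With room for all three fields the loader runs once per field. With room for only two, this cyclic access pattern misses on every request, which is the "reload all the time" behaviour described above.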

clint


Instead of using scripts, can you simply index the eventual value of
plays.count? Your indexing process would have to contain more logic, but
this logic is executed only once per document, not on every query.
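
As a rough illustration of querying without a script, the sketch below builds the script-based request from the original post and a variant that facets on the indexed field through the terms facet's `field` option. The helper names `script_terms_facet` and `field_terms_facet` are hypothetical, and the requests are only constructed, not sent. Whether the field-based form helps in this case is an assumption that depends on the mapping and on the field data fitting in memory.

```python
import json


def script_terms_facet(user_id, field, size=10):
    """Request body in the shape used in the original post (script-based facet)."""
    return {
        "size": 0,
        "query": {"field": {"follows": user_id}},
        "facets": {
            "plays": {
                "terms": {"size": size, "script": "doc['%s'].values" % field}
            }
        },
    }


def field_terms_facet(user_id, field, size=10):
    """Same request, but faceting on the indexed field instead of a script."""
    body = script_terms_facet(user_id, field, size)
    body["facets"]["plays"]["terms"] = {"size": size, "field": field}
    return body


request = field_terms_facet("2324", "plays.musicId")
print(json.dumps(request, sort_keys=True))
```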

Can you provide some sample documents?

Cheers,

Ivan

(In reply to Clinton Gormley <clint@traveljury.com>, Fri, Oct 18, 2013 at 10:34 AM.)


Thanks for the kind advice.

I have posted a follow-up on Google Groups.
That page contains sample documents and the specific queries.

Thanks again!

(In reply to Ivan Brusic, Saturday, October 19, 2013, 3:07:03 AM UTC+9.)