Size of Field Cache built/loaded into memory with respect to query executed

Hi,
I am trying to understand how and when the field cache is built and loaded
into memory.
Is the field cache built and loaded as soon as any single query is
performed?

For the following queries, how much of the field cache is built and loaded
into memory? In other words, does ES load only a subset of the field cache,
depending on what matches the query, or all of it at once?

  1. Query with match_all set and with no filters.
  2. Query with match_all set and with a few filters.
  3. Constant score query with a few filters, something like "query" : {
    "constant_score" : { "filter" : { bunch of filters here} } }
  4. Query which hits a subset of documents, plus a few filters.

Thanks!
Vinay
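
For illustration, the four scenarios above could be written as request
bodies roughly as follows. This is only a sketch: it assumes the query DSL
of early 2013 (when this thread was written), it assumes that "with a few
filters" means a filtered query wrapping the filters, and the field names
(status, tag, title) and the helper name scenario_bodies are hypothetical.

```python
import json

# Hypothetical filters; "status" and "tag" are placeholder field names.
FILTERS = {
    "bool": {
        "must": [
            {"term": {"status": "published"}},
            {"term": {"tag": "search"}},
        ]
    }
}


def scenario_bodies(filters=FILTERS):
    """Return one request body per scenario, keyed by scenario number."""
    return {
        # 1. match_all, no filters
        1: {"query": {"match_all": {}}},
        # 2. match_all plus filters (assumed here to be a filtered query)
        2: {"query": {"filtered": {"query": {"match_all": {}},
                                   "filter": filters}}},
        # 3. constant_score query wrapping the filters
        3: {"query": {"constant_score": {"filter": filters}}},
        # 4. a query that matches a subset of documents, plus filters
        4: {"query": {"filtered": {"query": {"term": {"title": "cache"}},
                                   "filter": filters}}},
    }


if __name__ == "__main__":
    for number, body in sorted(scenario_bodies().items()):
        print(number, json.dumps(body))
```

None of these bodies sorts, facets, or runs a script on its own, so they
only become relevant to the field cache once facets are added to them.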

--

Hello Vinay,

AFAIK, field caches are used when sorting or faceting, and none of the
scenarios you mentioned seems to do either.

Instead, I see you have some filters there. Are you interested in filter
caches?

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--

Yes, field caches are loaded for sorting, for faceting, and when a script
accesses a document field using constructs such as doc['field_name'].
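
To make the three cases concrete, here is a minimal sketch of request
bodies that sort on a field, run a terms facet on a field, and read a field
from a script. It assumes the 2013-era request format; the helper names and
the example field names are hypothetical and not taken from this thread.

```python
import json


def sort_body(field):
    """Search body that sorts hits on `field`."""
    return {
        "query": {"match_all": {}},
        "sort": [{field: {"order": "asc"}}],
    }


def terms_facet_body(field):
    """Search body that computes a terms facet on `field`."""
    return {
        "query": {"match_all": {}},
        "facets": {"by_" + field: {"terms": {"field": field}}},
    }


def script_field_body(field):
    """Search body whose script reads `field` through doc['...']."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            field + "_value": {"script": "doc['%s'].value" % field}
        },
    }


if __name__ == "__main__":
    # "price" and "tag" are placeholder field names.
    for body in (sort_body("price"), terms_facet_body("tag"),
                 script_field_body("price")):
        print(json.dumps(body))
```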

--

Radu, Igor, thanks for the replies.
Sorry, I was not clear with my query (pun intended).
In the scenarios I described, I am using those queries/filters to get
facets on various fields.
So my question is: when is the field cache built, and does it depend on
the query or filters used? Or is it built completely in one shot for all
documents and all fields?

Secondly, to calculate facets I am using a constant score query with a
filter ( "query" : { "constant_score" : { "filter" : { bunch of filters
here} } }). I am wondering whether this approach is more or less efficient,
speed-wise, than using a mix of query and filter, where the query hits a
subset of documents and the filter is run on that subset.

Thanks again!
Vinay

--

Hello Vinay,

On Fri, Jan 4, 2013 at 7:46 PM, revdev clickingcam@gmail.com wrote:

> Radu, Igor, Thanks for the reply.
> Sorry, I was not clear with my query (pun intended).
> In the scenarios I described, I am using those queries/filters to get
> facets on various fields.
> So, my question is about when is field cache built and does it depend upon
> the query or filters used? or is it built completely in one shot for all
> documents and all fields?

My understanding is that when ES does faceting, it loads the relevant
fields (of the relevant documents) into the field cache. So if different
queries return different sets of results, the load on your field cache
should be different.

> Secondly, to calculate facets I am using a constant score query with
> filter ( "query" : { "constant_score" : { "filter" : { bunch of filters
> here} } }). I am wondering if this approach is less/more efficient
> speed-wise as compared to using a mix of query and filter where query hits
> a subset of documents and filter is run on that subset.

Filters on their own benefit from filter caches, so they should be faster.
But please note that facets are run on the query results and ignore a
top-level filter. So if you need your facets to account for the filters,
you need to either wrap the filters in a constant_score query or use
facet_filter.

As far as I know, facet filters are the faster of the two. But again, they
apply only to facets, not to search results. So if you want the same
filter applied to both, you'd have to stick with constant_score.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
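
The three placements discussed in this reply can be sketched as request
bodies. This is an illustration under assumptions, not a tested recipe: it
assumes the 2013-era facets API, and the helper names and the example field
names (status, tag) are hypothetical.

```python
import json


def _terms_facet(field, facet_filter=None):
    """Build a named terms facet, optionally restricted by facet_filter."""
    facet = {"terms": {"field": field}}
    if facet_filter is not None:
        facet["facet_filter"] = facet_filter
    return {"by_" + field: facet}


def top_level_filter(filters, facet_field):
    """Filter narrows the hits only; facets still see the whole query."""
    return {
        "query": {"match_all": {}},
        "filter": filters,
        "facets": _terms_facet(facet_field),
    }


def constant_score_wrapped(filters, facet_field):
    """Filter is part of the query, so hits and facets are both narrowed."""
    return {
        "query": {"constant_score": {"filter": filters}},
        "facets": _terms_facet(facet_field),
    }


def facet_filter_only(filters, facet_field):
    """Filter narrows the facet only; the hits are left unfiltered."""
    return {
        "query": {"match_all": {}},
        "facets": _terms_facet(facet_field, facet_filter=filters),
    }


if __name__ == "__main__":
    example_filter = {"term": {"status": "published"}}  # placeholder filter
    for build in (top_level_filter, constant_score_wrapped,
                  facet_filter_only):
        print(build.__name__, json.dumps(build(example_filter, "tag")))
```

To filter both hits and facets with one filter, the constant_score form is
the one that matches the advice above.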

--

Thanks a lot Radu! That answers my questions!

--