Clear single field cache


(Sebastian Gavarini) #1

Hi all,

I have a use case where a user can search by date recency, in fixed
intervals, e.g. {today, yesterday, past week, past month}.

I was planning to use the field cache, as the filter is going to be used
repeatedly, but I would need to clear the cache for that specific
date filter alone, once every new day (at 24:00).

The problem is that the clear cache API, as I understood from the docs
and the Java code of ClearIndicesCacheRequest, clears all the caches of a
certain type and index as a minimum granularity. I would like to clear
just one field cache, for example "publish_date". I could of course
clear all the caches, but I would lose a lot of other important
fields that have not changed, just because one field did change.

Is clearing the cache of a single field (or list of fields) possible
somehow today? Can I add a feature request for it? Any other ideas?

Thanks,
Sebastian.


(Shay Banon) #2

I think there is some confusion here. There is the filter cache, which is used when certain filters are used in different places when searching, and the field cache, which is used for things like range faceting, date histograms and the like. I did not understand which one you are going to use to implement this (not enough info to guess).

There is no API to clear a specific field or fields from the field-level cache, but one can be added. Though, in your case (if you are going to use it), it does not make sense to clear it, since you will still use it next time around. Clearing specific filters makes less sense: they are so discrete that you would want some way to group them. They do get cleared when memory becomes scarce.

-shay.banon


(Sebastian Gavarini) #3

Hi Shay,

Sorry for the confusion, I'll try to clarify it.

In my case I was planning to use "rangeFilter" with the epoch milliseconds
of the interval the user selected, e.g. if the user selects "yesterday", I
would calculate, based on today's date, the milliseconds that correspond
to "yesterday". I think a cached filter would be appropriate because, until
the day changes at 24:00, I could keep using the cached filters.
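
A minimal sketch of that calculation, in Python rather than the Java API. It assumes UTC midnight as the day boundary, half-open ranges, and 7 and 30 days for "past week" and "past month"; the function name and labels are made up for illustration. The point is that bounds aligned to the start of the day stay identical for every request made that day, so the same four filters can be reused until the day rolls over.

```python
from datetime import datetime, timedelta, timezone


def recency_ranges(now):
    """Return {label: (from_ms, to_ms)} with bounds aligned to midnight UTC.

    Every bound is derived from the start of the current day, so two calls
    made on the same day return exactly the same values.
    """
    def to_ms(dt):
        # Bounds are whole seconds, so integer arithmetic is exact.
        return int(dt.timestamp()) * 1000

    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    tomorrow = today + timedelta(days=1)
    return {
        "today": (to_ms(today), to_ms(tomorrow)),
        "yesterday": (to_ms(today - timedelta(days=1)), to_ms(today)),
        "past_week": (to_ms(today - timedelta(days=7)), to_ms(tomorrow)),
        "past_month": (to_ms(today - timedelta(days=30)), to_ms(tomorrow)),
    }


morning = datetime(2011, 3, 22, 9, 15, tzinfo=timezone.utc)
evening = datetime(2011, 3, 22, 23, 59, tzinfo=timezone.utc)
next_day = datetime(2011, 3, 23, 0, 1, tzinfo=timezone.utc)

same_day = recency_ranges(morning) == recency_ranges(evening)
rolled_over = recency_ranges(morning) != recency_ranges(next_day)
```

Here same_day and rolled_over are both true: the bounds only change once the date changes, which is when a new set of four filters would be created.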

I understand that there would be many filters associated with my
field "publish_date", one for each possible value {today, yesterday, past
week, past month}.

I would like to clear all the range filters associated with the
field "publish_date".

"Specific filters make less sense, since they are so discrete you would
want to have some sort of way to group them possibly."

Does my explanation of publish_date and its possible values count as a
way to group them, or why do you think this won't be useful?

Thanks,
Sebastian.


(Shay Banon) #4

If it is filters, then they will get cleared out when memory gets scarce. There are other options (not documented yet, I should really document them) to control the maximum number of filters that are allowed to be cached: https://github.com/elasticsearch/elasticsearch/issues/closed?page=2#issue/721.

The reason it's harder to say something like "clear all filters that match X" is that filters are very different from one another, so it's more difficult to know what groups them and then expose an API to do that, even if you specify the field that they are used on.

I don't think you will need this option, but if you see that you do, we can think harder about how to do that :)


(Sebastian Gavarini) #5

Agreed about the difficulty of clearing a filter in an API-friendly way.

I think selecting the size of the filter cache would be fine.

Just one more thing: how do you know (in advance for sizing, and at runtime
for monitoring) the number of filters used? I know the ones I request from my
searches, but are others created implicitly inside ES? Is there an API
call to find out the size of that cache (in number of items, not bytes)?


(Shay Banon) #6

The most common other place where elasticsearch uses filters is type-level searches (the type is used as a term filter). But I agree, we can enhance the stats API to include counts as well. Care to open a feature request for that?


(Sebastian Gavarini) #7

As I was typing the feature request, I had a thought: I think it won't
do, though I'm not sure.

Here's the thing: the filter caches are either LRU, Soft or Weak (the last
two use some sort of LRU-based eviction as stated in the docs, but I'm not
sure of the exact algorithm), and none of them evict on idle time. So the
count would grow up to the maximum and stay there, making it of little use
as a monitoring tool. Maybe it would decrement if the GC decides to clear
some Soft/Weak references, but there is no guarantee.
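
To make the saturation concern concrete, here is a toy simulation in Python. It is not Elasticsearch's cache implementation: the CountingLRU class, the limit of 8 entries and the (day, label) keys are all assumptions chosen for illustration. Four new date filters are created each day, and the entry count is recorded at the end of every day.

```python
from collections import OrderedDict


class CountingLRU:
    """Size-bounded LRU cache that also counts how many entries it evicted."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()
        self.evicted = 0

    def get_or_load(self, key, loader):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = loader(key)
        self.entries[key] = value
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the least recently used
            self.evicted += 1
        return value


cache = CountingLRU(max_entries=8)
counts = []
for day in range(5):
    for label in ("today", "yesterday", "past_week", "past_month"):
        # Each day produces four filters with new bounds, hence new keys.
        cache.get_or_load((day, label), lambda key: object())
    counts.append(len(cache.entries))
```

In this run the count reaches the limit on the second day and never moves again, while the eviction counter keeps growing. A plain count therefore says little once the cache is full.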

I think the whole thing won't add much information. Maybe the best way to
find the right running footprint would be an LRU with active eviction based
on idle access time, but it's probably more complex to implement and I'm
not sure it adds real value.
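
A rough sketch of what eviction based on idle access time could look like, again as a standalone Python toy and not as anything Elasticsearch provides. The class name, the max_idle parameter and the injectable clock are hypothetical. Purging happens on every access here; a real implementation would more likely use a background task.

```python
import time


class IdleExpiringCache:
    """Cache whose entries are dropped after max_idle seconds without access."""

    def __init__(self, max_idle, clock=time.monotonic):
        self.max_idle = max_idle
        self.clock = clock
        self.entries = {}  # key -> (value, last_access_time)

    def purge(self):
        """Remove entries idle for longer than max_idle; return how many."""
        now = self.clock()
        idle = [key for key, (_, seen) in self.entries.items()
                if now - seen > self.max_idle]
        for key in idle:
            del self.entries[key]
        return len(idle)

    def get_or_load(self, key, loader):
        self.purge()
        if key in self.entries:
            value = self.entries[key][0]
        else:
            value = loader(key)
        self.entries[key] = (value, self.clock())  # refresh last access time
        return value


fake_now = [0.0]  # controllable clock so the example needs no sleeping
cache = IdleExpiringCache(max_idle=3600, clock=lambda: fake_now[0])
cache.get_or_load("yesterday@day1", lambda key: "bitset-1")
fake_now[0] = 1800
cache.get_or_load("today@day1", lambda key: "bitset-2")
fake_now[0] = 4000  # first entry idle for 4000 s, second for 2200 s
removed = cache.purge()
```

With this kind of expiry the entry count follows the filters that are actually in use, so yesterday's date filters would drop out on their own once nobody requests them.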

Maybe the "stats" API could expose not only "count" (the current cache
size) but also "evictedCount" (or something like that, counting the
number of evicted filters). Then, with a couple of calls to the API separated
in time, you could understand the rate of creation/eviction of filters
for your use case (in my case, once a day I would create 4 new
filters based on publish_date).

Is this correct? Should I add a feature request for those two stats?
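
The two-snapshot idea can be sketched as follows. The "count" and "evictedCount" fields are the names proposed above, not fields of an existing stats API, and the snapshot values are invented for illustration. The number of filters created in the interval is the net growth of the cache plus the number evicted over the same interval.

```python
def filter_churn(before, after, seconds):
    """Derive creation and eviction figures from two stats snapshots.

    created = net growth in count + entries evicted during the interval.
    """
    evicted = after["evictedCount"] - before["evictedCount"]
    created = (after["count"] - before["count"]) + evicted
    return {
        "created": created,
        "evicted": evicted,
        "created_per_second": created / seconds,
        "evicted_per_second": evicted / seconds,
    }


# Hypothetical snapshots taken 24 hours apart on a cache that is already full.
monday = {"count": 100, "evictedCount": 40}
tuesday = {"count": 100, "evictedCount": 44}
churn = filter_churn(monday, tuesday, seconds=86400)
```

For these snapshots the count is unchanged, yet the result shows 4 filters created and 4 evicted over the day, which matches the once-a-day pattern of the publish_date filters.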


(Shay Banon) #8

I need to check the eviction listener thingy, but adding an option to expire entries (time-based, measured from last access) is certainly something that can be added. Open a feature request for it.


(Sebastian Gavarini) #9

Done,

Stats extension:

Last access time based expiry caches:

Thanks,
Sebastian.

