Elasticsearch doesn't cache query with rounded date

I originally posted the question to Stack Overflow, but I guess it's more appropriate to ask here. (For reference, the original Stack Overflow question is titled "Elasticsearch doesn't cache query with rounded date".)

I'm using this Elasticsearch query:

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"address.country.keyword": "Germany"}},
            {"term": {"address.city.keyword": "Berlin"}},
            {"range": {"timestamp": {"gte": "now-3d/h"}}}
          ]
        }
      }
    }
  },
  "size": 0
}

AFAICT this should qualify the query for caching, at least based on these resources:

However, according to the /_stats/request_cache endpoint, caching is not involved: hit_count stays the same. If I remove the range clause, it works. If I change the time specification to now/h, it also works. Am I missing something?
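To illustrate why a rounded expression such as now-3d/h looks cacheable, here is a minimal Python sketch of how such a lower bound resolves. It only mimics the date math semantics for a gte bound (subtract, then round down to the unit); it is not Elasticsearch's implementation, and the function name resolve_gte is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def resolve_gte(now: datetime, days_back: int, unit: str) -> datetime:
    """Mimic `now-<days_back>d/<unit>` for a `gte` bound (rounds down)."""
    shifted = now - timedelta(days=days_back)
    if unit == "h":
        # Round down to the start of the hour.
        return shifted.replace(minute=0, second=0, microsecond=0)
    if unit == "d":
        # Round down to the start of the day.
        return shifted.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported rounding unit: {unit!r}")


if __name__ == "__main__":
    first = datetime(2021, 4, 13, 10, 5, 0, tzinfo=timezone.utc)
    second = datetime(2021, 4, 13, 10, 55, 30, tzinfo=timezone.utc)
    # Two requests sent within the same hour resolve to the same bound.
    print(resolve_gte(first, 3, "h"), resolve_gte(second, 3, "h"))
```

Within one hour every request resolves to the same timestamp, which is why one would expect the result to be reusable until the hour rolls over.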

Elasticsearch's version is 6.2.3.

now disables the query cache. Now that I think of it, I'm not sure it has to, especially when rounded like that.

Sorry, that can't be right. I looked at the master branch and we do indeed cache now queries. And you are talking about the request cache anyway, which is "higher" than the query cache.

6.2 is quite old at this point. I'd try again in 7.12 - it may still not hit the request cache. But I think it ought to.

Thanks for the response. As mentioned in a comment on the original post, it works in 7.9.1. However, I can't easily upgrade to 7.x due to breaking changes. I'd at least need to find the lowest version that fixed this issue. Anyway, I'd assume it would work, as the 6.2 documentation (linked in the question) suggests it should.

Oh! Sorry. I hadn't read closely enough. Old versions are bound to have bugs because we don't backport everything everywhere.

I'd also put everything in the filter clause instead of the must clause.

At least upgrade to 6.8 latest.

OK I'll give it a try and upgrade to 6.8, thanks.

@dadoonet I tested the exact same query on both 6.8 and the newest 7.12.0 (both single-node setups running in Docker) and caching wasn't involved in either case. At least querying /_stats/request_cache doesn't show anything. When I remove the range using now, I see a cache miss the first time and then cache hits every time. But when now is involved, no caching happens. So there are basically three options, I guess:

  • I'm doing something wrong. Either I specify the query incorrectly or I'm "debugging" the cache usage the wrong way.
  • Whenever now is used in the query (even with rounding), it prevents the cache usage. In that case I guess the documentation is misleading.
  • Or, there's a bug.

Is there anything else I could try? Both in terms of reformulating the query or debugging further.
Thanks

If you are using only now then no cache is involved I think.
now-3d/h should be cached for one hour or so I guess...

Are you running the query multiple times like 5x or 10x? I think that you need to do that to see it cached.

Yeah, I tested almost every possible combination. When using plain now, it's not cached, as expected. But the cache is not involved even when rounding using /d or /h. I created a tiny "test suite" which contains a script that sends prepared queries multiple times and another script that monitors the request cache and also the query cache using the _stats endpoint. The query cache is leveraged; however, the shard-level request cache is not.

I wonder if this excellent blog post written by @spinscale would help to understand what is happening?

Yeah, maybe it's just that I don't really understand what's going on. I'll go through the article and get back if there are any concerns left. Thanks a lot for sharing!

I've read the article, but I wouldn't say I understand it significantly better, especially with regard to the usage of now and date math / rounding. I made a final test with three different queries: hits only, aggs only, and a combination of both:

Query 1 (hits only):

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {"term": {"address.country.keyword": "Česko"}},
            {"term": {"address.city.keyword": "Praha"}},
            {"range": {"last_seen": {"gte": "now-3d/d"}}}
          ]
        }
      }
    }
  },
  "size": 5
}

Query 2 (aggs only):

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {"term": {"address.country.keyword": "Česko"}},
            {"term": {"address.city.keyword": "Praha"}},
            {"range": {"last_seen": {"gte": "now-3d/d"}}}
          ]
        }
      }
    }
  },
  "aggs": {
    "area": {"stats": {"field": "area"}}
  },
  "size": 0
}

Query 3 (both hits and aggs):

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {"term": {"address.country.keyword": "Česko"}},
            {"term": {"address.city.keyword": "Praha"}},
            {"range": {"last_seen": {"gte": "now-3d/d"}}}
          ]
        }
      }
    }
  },
  "aggs": {
    "area": {"stats": {"field": "area"}}
  },
  "size": 5
}

I was sending these queries against Elasticsearch in an infinite loop (with 200 ms delay between requests) and monitored the request and query cache usage with

watch -n 1 "curl -s -X GET http://localhost:9200/_stats | jq '.indices.realestate.total | {request_cache, query_cache}'"
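Counter changes are easier to spot by diffing two snapshots than by eyeballing the watch output. The following is a minimal Python sketch; cache_delta and describe are hypothetical helpers, and it assumes each snapshot is a dict of numeric counters like the request_cache section of the _stats response.

```python
def cache_delta(before: dict, after: dict) -> dict:
    """Return the per-counter difference between two stats snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}


def describe(delta: dict) -> str:
    """Summarize what a single search did to the cache counters."""
    if delta.get("hit_count", 0) > 0:
        return "hit"
    if delta.get("miss_count", 0) > 0:
        return "miss"
    # Neither counter moved between the two snapshots.
    return "no hit or miss recorded"


if __name__ == "__main__":
    # Snapshots taken before and after one search (values are illustrative).
    before = {"memory_size_in_bytes": 0, "evictions": 0, "hit_count": 0, "miss_count": 1}
    after = {"memory_size_in_bytes": 0, "evictions": 0, "hit_count": 0, "miss_count": 1}
    print(describe(cache_delta(before, after)))
```

Taking one snapshot before and one after each individual request makes it possible to attribute a hit or miss to a specific query rather than to the loop as a whole.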

The results for each examined version of Elasticsearch are here:
6.2.3 (version I have in production): request cache not used for any query, query cache not used for any query
6.8.15: request cache not used for any query, query cache used for all three queries
7.12.0: request cache not used for any query, query cache used for all three queries

So I'm not really sure what's going on, especially why the request cache is not used. Is there anything else I can do to understand it better, and anything I can do to leverage a cache (effectively any cache that can speed up the queries) in 6.2.3 while using now with rounding?
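One possible workaround, sketched here in Python as an assumption rather than a confirmed fix, is to do the rounding on the client and send an absolute timestamp, so that the request contains no now at all. Because the request cache uses the whole JSON body as the cache key, the sketch also serializes with sorted keys so that repeated requests are byte-identical. The function name build_search_body is hypothetical, it expects a timezone-aware datetime, and whether this yields cache hits on a given version should be verified with the stats endpoint.

```python
import json
from datetime import datetime, timedelta, timezone


def build_search_body(now: datetime, days_back: int = 3) -> str:
    """Build the aggs-only query with `now-3d/d` resolved on the client."""
    utc_now = now.astimezone(timezone.utc)
    # Subtract the offset, then round down to the start of the UTC day.
    floor = (utc_now - timedelta(days=days_back)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    body = {
        "query": {"constant_score": {"filter": {"bool": {"filter": [
            {"term": {"address.country.keyword": "Česko"}},
            {"term": {"address.city.keyword": "Praha"}},
            {"range": {"last_seen": {"gte": floor.strftime("%Y-%m-%dT%H:%M:%SZ")}}},
        ]}}}},
        "aggs": {"area": {"stats": {"field": "area"}}},
        "size": 0,
    }
    # Stable key order keeps the serialized body identical between calls.
    return json.dumps(body, sort_keys=True, ensure_ascii=False)


if __name__ == "__main__":
    print(build_search_body(datetime.now(timezone.utc)))
```

The body changes only once per day, so all requests sent in between are identical. The sketch keeps size at 0 because, by default, the request cache only stores results of requests with size=0.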

Hey,

I tried to create a minimal reproduction to check out the caching behaviour. Minimal implies that the query cache will not be used, but I suppose that is fine for our testing.

DELETE test

PUT test 
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

PUT test/_doc/1
{
  "address" : {
    "country" : "T1",
    "city" : "T2"
  },
  "last_seen" : "2021-04-10T12:34:56.789Z",
  "area" : 10.0
}

PUT test/_doc/2
{
  "address" : {
    "country" : "T1",
    "city" : "T2"
  },
  "last_seen" : "2021-04-11T12:34:56.789Z",
  "area" : 10.0
}

PUT test/_doc/3
{
  "address" : {
    "country" : "T1",
    "city" : "T2"
  },
  "last_seen" : "2021-04-12T12:34:56.789Z",
  "area" : 10.0
}

# verify fields are mapped correctly
GET test/_mapping

GET test/_stats?filter_path=_all.primaries.request_cache,_all.primaries.query_cache

# also try without the parameter
GET test/_search?request_cache=true
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {
              "term": {
                "address.country.keyword": "T1"
              }
            },
            {
              "term": {
                "address.city.keyword": "T2"
              }
            },
            {
              "range": {
                "last_seen": {
                  "gte": "now-3d/d"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "area": {
      "stats": {
        "field": "area"
      }
    }
  },
  "size": 0
}

So this indexes three documents matching your query (well, right now, due to the now part). On 7.12 this immediately fills up the request cache, even if I do not specify request_cache=true in the URL.

Is this behaviour different to yours?

That said, the query cache will not work because there is not enough data, as mentioned in the blog post, but let's try one cache at a time :)

--Alex

Hi Alex, thanks a lot for staying with me.
When I run the exact sequence of calls you provided, I get a request cache miss for the first time. What's weird to me is that when I subsequently call the search API with the exact same query, the cache stats stay the same:

{
  "_all" : {
    "primaries" : {
      "query_cache" : {
        "memory_size_in_bytes" : 0,
        "total_count" : 0,
        "hit_count" : 0,
        "miss_count" : 0,
        "cache_size" : 0,
        "cache_count" : 0,
        "evictions" : 0
      },
      "request_cache" : {
        "memory_size_in_bytes" : 0,
        "evictions" : 0,
        "hit_count" : 0,
        "miss_count" : 1
      }
    }
  }
}

To provide complete information, here's the exact version I'm using:

{
  "name": "f09a1378ed75",
  "cluster_name": "es-test",
  "cluster_uuid": "ZzobHNUiR86jwCS9iKZ7Qg",
  "version": {
    "number": "7.12.0",
    "build_flavor": "default",
    "build_type": "docker",
    "build_hash": "78722783c38caa25a70982b5b042074cde5d3b3a",
    "build_date": "2021-03-18T06:17:15.410153305Z",
    "build_snapshot": false,
    "lucene_version": "8.8.0",
    "minimum_wire_compatibility_version": "6.8.0",
    "minimum_index_compatibility_version": "6.0.0-beta1"
  },
  "tagline": "You Know, for Search"
}

Just to get this right: Even when you configure via the parameter to use the cache, it does not get used?

I also tested on 7.12, but not on Docker. How much memory are you giving to the Elasticsearch process, and are there any configuration options that you set?

--Alex

Yeah, even when I use the query parameter request_cache=true, it doesn't use it.

I use this custom Dockerfile:

FROM docker.elastic.co/elasticsearch/elasticsearch:7.12.0

RUN ./bin/elasticsearch-plugin install analysis-icu

RUN mkdir -p ./config/hunspell/cs_CZ \
 && cd ./config/hunspell/cs_CZ \
 && curl -O https://issues.apache.org/jira/secure/attachment/12541597/cs_CZ.aff \
 && curl -O https://issues.apache.org/jira/secure/attachment/12541598/cs_CZ.dic \
 && echo "strict_affix_parsing: false" > ./settings.yml

This is the command I use to run the Docker container:

docker run --name=elastic -d -p 9200:9200 -p 9300:9300 --restart=always \
  -e ES_JAVA_OPTS="-Xms2g -Xmx2g" -e "cluster.name=es-test" \
  -e "discovery.type=single-node" -e "bootstrap.memory_lock=true" \
  --ulimit memlock=-1:-1 \
  elasticsearch:test

The physical host has 4 GB of RAM.

Can you share the cluster state/nodes stats/nodes info output somewhere in a gist? The shard-level request cache is enabled by default, but it can be deactivated dynamically per index via the update index settings API. The setting is index.requests.cache.enable.
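A quick way to rule out that setting is to inspect the index settings response. The Python sketch below assumes the usual shape of a GET <index>/_settings response (index name, then "settings", then either nested or flat keys, with values returned as strings); explicit_request_cache_setting is a hypothetical helper. It returns None when the setting is not set explicitly, in which case the default applies.

```python
from typing import Optional


def explicit_request_cache_setting(settings_response: dict, index: str) -> Optional[str]:
    """Return the explicit index.requests.cache.enable value, or None if unset."""
    settings = settings_response.get(index, {}).get("settings", {})
    # Shape returned with flat_settings=true.
    flat = settings.get("index.requests.cache.enable")
    if flat is not None:
        return flat
    # Default nested shape: settings -> index -> requests -> cache -> enable.
    node = settings
    for part in ("index", "requests", "cache", "enable"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


if __name__ == "__main__":
    # Illustrative response for an index where the cache was switched off.
    response = {"test": {"settings": {"index": {
        "number_of_shards": "1",
        "requests": {"cache": {"enable": "false"}},
    }}}}
    print(explicit_request_cache_setting(response, "test"))
```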

Here is the requested output - https://gist.github.com/tlinhart/857a46ff46fd6d0afeb43fa04dd62920. If there is anything else I could provide, please let me know.