Elasticsearch 7.0 SQL ACCESS request cache

Looks like SQL query is not cached, and equivalent dsl qeury is cached.

Every single time this SQL query is executed, it takes over 3 seconds. (The number of documents is about 6,000,000)

    POST /_sql?format=json
    {
        "query": "SELECT HISTOGRAM(\"batchDate\", INTERVAL 1 DAY) AS \"h_date\", SUM(\"aggrCount\") AS \"cnt\" FROM \"my.index-*\" WHERE \"batchDate\" < 1560988799999 AND \"batchDate\" >= 1560340034000 GROUP BY \"h_date\""
    }

And when I'm using QueryDSL using request body got from /_sql/translate API with the sql query above, it took over 3~5 seconds only first time, and next it took under 1ms. Maybe it might be cached.

    POST /my.index-*/_search
    {
      "size" : 0,
      "query" : {
        "range" : {
          "batchDate" : {
            "from" : 1560340034000,
            "to" : 1560988799999,
            "include_lower" : true,
            "include_upper" : false,
            "boost" : 1.0
          }
        }
      },
      "_source" : false,
      "stored_fields" : "_none_",
      "aggregations" : {
        "groupby" : {
          "composite" : {
            "size" : 1000,
            "sources" : [
              {
                "25579" : {
                  "date_histogram" : {
                    "field" : "batchDate",
                    "missing_bucket" : true,
                    "value_type" : "date",
                    "order" : "asc",
                    "interval" : 86400000,
                    "time_zone" : "Z"
                  }
                }
              }
            ]
          },
          "aggregations" : {
            "25580" : {
              "sum" : {
                "field" : "aggrCount"
              }
            }
          }
        }
      }
    }

@c81b4c93b1f70af5398f the json we generate as a query can be different every time you call translate. The relevant parts of the query will not be different, obviously, but the name of the aggregations, for example, can be different - 25579 and 25580 from your example. And I think caching works on the body of the query (the json itself). If you generate the query once, run it several times, and then generate it again with translate and run it, does it take seconds or returns instantly?

Hi

Have you setup elasticsearch cluster with V 7 at least 3 node cluster ?

Nuwan

The name of aggregations depends like you mentioned in translate API.
I repeated the query dsl request several times after generating translated json body(dsl) once, and it got much faster after several times. But it gets fast only with translated dsl that has same name of aggregations. If I generate new translated json body(with another aggregation name), it takes long again.
When it comes with SQL, It takes over 3 seconds even after several times.
Can I fix the name of aggregations when requesting by SQL?

Hi nuwan.
My cluster has 9 nodes and every node is V 7.0.0

Hi

Thank you very much for quick respond my configuration is look like below . appciate if you can help me to setup 3 node cluster . I'm going go update my 3 node production cluster from v 6.5.1 to v 7.1.1

[root@orc-app1 ~]# cat /etc/elasticsearch/elasticsearch.yml
cluster.name: ElasticDemO
node.name: ${HOSTNAME}
network.host: 192.168.60.4
xpack.security.enabled: false
bootstrap.system_call_filter: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
discovery.seed_hosts:

  • 192.168.60.4:9300
  • 192.168.60.5:9300
  • 192.168.60.6:9300
    cluster.initial_master_nodes: ["192.168.60.4","192.168.60.5","192.168.60.6"]
    http.cors.enabled: true
    discovery.zen.minimum_master_nodes: 2
    node.master: true

[root@orc-app1 ~]#

http://192.168.60.4:9200/_cat/nodes?v

ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.60.4 25 94 5 0.00 0.02 0.05 mdi * orc-app1.dev

appreciate if you can send me sample configuration file or if you can go through my configuation file and let me know what need to be modified

@nuwan please, open a new thread of discussion and keep this discussion focused on the initial issue. Thank you.

yes I did still I'm unable to get the answer for that appreciate if any one help me regarding my concerns Any one setup 3 node cluster with elasticsearch v 7

Thank you

@c81b4c93b1f70af5398f no, you cannot control the name of the generated aggregation in a query, everything is transparent and ES SQL should behave just like any other SQL facing tool.

I've created a github issue and we'll discuss internally if this is feasible to implement: https://github.com/elastic/elasticsearch/issues/43531

Thanks for your follow up. Hope to be fixed soon.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.