Speed up date range query

Hello everybody,

First of all I would like to thank you for your help.

My problem is slow query response times. I have read a lot in the forum about cluster configuration, shards and so on, and this is finally my cluster configuration:

  • 1 dedicated master node. Let's call it Node 1.
  • 2 nodes dedicated to ingest/write that also store data. Let's call them Node 2 & Node 3.
  • 2 coordinating nodes that are also master eligible, used for queries and to run Kibana. Let's call them Node 4 & Node 5.
  • 1 node with data only, as a query "back up". Let's call it Node 6.
  • 1 node with ingest/write and data, as a "back up". Let's call it Node 7.

When we run write queries (we use the NEST API from a C# program), our tests write from 2 different instances, each with this connection pool:

  • Instance 1:
    • Node 2.
    • Node 7.
    • Node 3.
  • Instance 2:
    • Node 3.
    • Node 7.
    • Node 2.

Each node has the same configuration: Windows 10, Intel i5, 8GB RAM (32GB from September), 256GB SSD.

Versions

  • Elasticsearch 7.8
  • Kibana 7.8

We are getting write results that are good enough for us (around 20K documents stored per second).

Our index & mapping were created with this pattern:

    PUT /poc_index
    {
      "settings": {
        "number_of_shards": 4,
        "index.auto_expand_replicas": "0-all",
        "refresh_interval": -1,
        "index.requests.cache.enable": true,
        "index.max_result_window": "1000000",
        "sort.field": "date",
        "sort.order": "desc"
      },
      "mappings": {
        "properties": {
          "date": {
            "type": "date",
            "store": true
          },
          "topic_id": {
            "type": "keyword",
            "store": true,
            "norms": false
          },
          "topic": {
            "type": "keyword",
            "store": true,
            "norms": false
          },
          "tag_id": {
            "type": "keyword",
            "store": true,
            "norms": false
          },
          "tag_position": {
            "type": "keyword"
          },
          "device_criteria": {
            "type": "keyword",
            "index": false
          },
          "device_type": {
            "type": "keyword",
            "index": false
          },
          "job_id": {
            "type": "keyword",
            "store": true,
            "norms": false
          },
          "value": {
            "type": "float",
            "store": true,
            "index": false
          },
          "trigger_value": {
            "type": "integer",
            "store": true,
            "index": false
          }
        }
      }
    }

The data we are storing is about 7 million documents per day.
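
One caveat about the settings above: "refresh_interval": -1 disables automatic refreshes, so newly indexed documents only become searchable after an explicit refresh. A minimal sketch of the two usual options, using the index name from this thread (the 30s interval is just an illustrative value):

    # option 1: refresh once, e.g. after a bulk load finishes
    POST /poc_index/_refresh

    # option 2: restore a periodic refresh
    PUT /poc_index/_settings
    {
      "index": { "refresh_interval": "30s" }
    }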

So here comes the problem. In our use case we have to run two kinds of queries:

  • Get all the data for one hour.
  • Get a specific topic in a date range.

Queries have been run from Kibana.

I don't know how to manage this. As you can see there is not much to configure for the first query, but I'm getting a 16-20 second response with a hits value of 279612886, and if I change to a smaller date range I get a 5-10 second response with a hits value of 270755. In my opinion both results are too slow; I expected to get more or less 7 million documents in 1-2 seconds. Am I crazy? This is the query:

    GET /poc_index/_search
    {
      "size": 300000,
      "_source": ["topic", "tag_id", "job_id", "value", "date"],
      "track_total_hits": true,
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "date": {
                  "gte": "2019-07-20T00:00:00",
                  "lte": "2020-07-21T02:00:00"
                }
              }
            }
          ]
        }
      }
    }

Also, I have another issue when I run this query that I can't understand:

    GET /_cat/thread_pool/search?v&h=node_name,name,active,rejected,completed

    node_name name   active rejected completed
    Node 4    search 0      0        0
    Node 2    search 0      0        46139
    Node 5    search 0      0        50812
    Node 6    search 0      0        193370
    Node 1    search 0      0        74995
    Node 7    search 0      0        154707
    Node 3    search 0      0        81846

How is it possible that Node 4 has done nothing when its purpose is to route queries? And why does Node 1 have completed queries when my app never sends anything to it and it is not in the connection pool?

The cluster structure does not make sense to me. Just because you CAN have dedicated node types does not mean you SHOULD for a small cluster like this. I would recommend the following:

  • You should always look to have at least 3 master-eligible nodes in a cluster. I would recommend setting up 3 nodes that hold data and are master eligible. If these turn out to be heavily loaded you can instead split out 3 dedicated master nodes.
  • If you are using a lot of ingest pipelines, set up 2 dedicated ingest nodes. These can also work as coordinating nodes and host Kibana. A sketch of the corresponding elasticsearch.yml roles follows this list.
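
A minimal sketch of these two node types as elasticsearch.yml settings, assuming the 7.8-era boolean role flags (the node.roles setting only arrived in later versions); this illustrates the layout recommended above, not tested config:

    # master-eligible data node (three of these)
    node.master: true
    node.data: true
    node.ingest: false

    # dedicated ingest node, also usable as coordinator and Kibana host (two of these)
    node.master: false
    node.data: false
    node.ingest: true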

Elasticsearch was never designed for retrieving large amounts of data. If you need to retrieve more than 10000 results you should use a scroll query, which requires multiple round trips. This is generally not meant to serve real-time queries and can be slow.

For optimal performance of large scan queries you should let Elasticsearch use the default sorting order.
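
To make those two suggestions concrete, here is a minimal sketch of a scroll that pages through a date range in the default _doc order. The index name and field list follow the examples earlier in this thread; the dates and the scroll id placeholder are illustrative:

    # initial request: keeps the search context alive for 1 minute
    GET /poc_index/_search?scroll=1m
    {
      "size": 10000,
      "sort": ["_doc"],
      "_source": ["topic", "tag_id", "job_id", "value", "date"],
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "date": {
                  "gte": "2020-07-20T00:00:00",
                  "lte": "2020-07-20T01:00:00"
                }
              }
            }
          ]
        }
      }
    }

    # repeat until the hits array comes back empty
    GET /_search/scroll
    {
      "scroll": "1m",
      "scroll_id": "<_scroll_id from the previous response>"
    }

Each response returns a _scroll_id for the next call, so pulling 300K documents at size 10000 means around 30 round trips; that is the multiple-round-trip cost mentioned above.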

Many thanks, Christian, for your answer.

According to your recommendations:

> You should always look to have at least 3 master-eligible nodes in a cluster. I would recommend setting up 3 nodes that hold data and are master eligible. If these turn out to be heavily loaded you can instead split out 3 dedicated master nodes.

It's almost what we have: 1 dedicated master node & 2 nodes that are master eligible. This configuration was chosen after reading some topics (I think I saw you in those discussions too :slight_smile: ).

As for data nodes, we have 4, and 3 of them ingest too --> are you saying it might be a better idea to keep the ingest nodes separate from the data nodes?

By the way, write queries are fast enough for us; they are not our bottleneck.

> Elasticsearch was never designed for retrieving large amounts of data. If you need to retrieve more than 10000 results you should use a scroll query, which requires multiple round trips. This is generally not meant to serve real-time queries and can be slow.

We do not need to retrieve data in real time, but we need to be as fast as possible. The amount of data to retrieve depends on the query.

  • Query type 1: all the data in a one-hour range, right now about 300K documents, but it needs to scale because by our calculations we could be talking about 2 million documents in one hour.

  • Query type 2: a specific topic in a date range. This depends on the range the user selects, but the most common is between 9 days and 1 month (2K to 1 million documents); a sketch of this query follows below.
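
As referenced above, a sketch of what query type 2 looks like against the mapping from earlier in the thread, combining a term filter on the topic keyword field with the date range. The topic value and the dates are made up for illustration:

    # "boiler_pressure" is a hypothetical topic value
    GET /poc_index/_search
    {
      "size": 10000,
      "_source": ["topic", "tag_id", "job_id", "value", "date"],
      "query": {
        "bool": {
          "filter": [
            { "term": { "topic": "boiler_pressure" } },
            {
              "range": {
                "date": {
                  "gte": "2020-06-20T00:00:00",
                  "lte": "2020-07-20T00:00:00"
                }
              }
            }
          ]
        }
      }
    }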

Could a scroll query be faster than getting all the data at once?

> For optimal performance of large scan queries you should let Elasticsearch use the default sorting order.

I don't understand this recommendation. I read in one discussion that for optimal queries you should sort the index the same way you are going to search. My thought was that if we look for the most recently inserted data, it would be a good idea to store it in that order. Am I wrong?

Many thanks!!

I generally try to minimise the number of different node configurations because, IMHO, it makes troubleshooting performance issues easier. Avoiding custom sorting means less work for the nodes and generally better performance.

Thanks again,

But some more questions about the problem:

  1. Do you think the amount of data we are trying to retrieve is too much?
  2. Could the query be improved/sped up somehow?
  3. Can the mapping be changed in any way to help performance?
  4. This 7-node cluster configuration is the one we thought would be best for this purpose, but it can be changed. Do you think changing it could improve the read queries?

Right now we are in a testing phase and we can change and try any new configuration without any problem.
