ES query slows down in high concurrency

gaorui · July 17, 2024, 10:19am

We have a large volume of NetFlow records, arriving at about 200,000 records/s, and each one needs a lookup in ES. The lookup associates a user with the record based on time, aIp, begin and end ports, and bIp; the exact query is at the end of this post. The problem is that ES cannot sustain 200,000 query requests/s, so NetFlow data backs up and the backlog keeps snowballing.
The test index is also being written to while it is queried, at about 1,000 records/s. It is split into one index per day, each holding about 45 GB of data, roughly 86.4 million records. Each lookup queries today's and yesterday's indices, and data is retained for 30 days.
The current cluster is 3 virtual machines, each with 16 cores, 32 GB of RAM and a 1 TB mechanical disk.
Hardware resources are limited: the cluster cannot be scaled up or moved to SSDs.
Query logic:

POST test_07_17/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "time": {
              "lte": 1721208610478
            }
          }
        }
      ],
      "should": [
        {
          "match": {
            "bIp": "10.22.102.203"
          }
        },
        {
          "bool": {
            "must": [
              {
                "match": {
                  "aIp": "102.13.203.209"
                }
              },
              {
                "range": {
                  "beginport": {
                    "lte": 203
                  }
                }
              },
              {
                "range": {
                  "endport": {
                    "gte": 203
                  }
                }
              }
            ]
          }
        }
      ]
    }
  },
  "sort": [
    {
      "time": {
        "order": "desc"
      }
    }
  ],
  "_source": ["username", "nasip", "mac"]
}
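One detail of the query above is worth checking. When a bool query contains a must clause, its should clauses are optional by default (minimum_should_match is 0), so as written every document with time <= the given value matches and the IP and port conditions only influence a score that the sort on time then ignores. The Python sketch below builds a variant of the request body under the assumption that the intent is "time <= ts AND (bIp matches OR (aIp matches AND beginport <= port <= endport))". The function name build_lookup_query and its parameters are hypothetical; the field names are taken from the query above. It also moves the conditions into filter context, which skips scoring and lets Elasticsearch cache the filters.

```python
import json


def build_lookup_query(ts_ms, a_ip, b_ip, port, size=10):
    """Build a search body for the user lookup (hypothetical helper).

    Assumption: at least one of the two IP conditions must hold.
    size defaults to 10, the Elasticsearch default; if only the most
    recent match is needed, size=1 reduces the fetch work per request.
    """
    ip_conditions = {
        "bool": {
            "should": [
                {"match": {"bIp": b_ip}},
                {
                    "bool": {
                        "filter": [
                            {"match": {"aIp": a_ip}},
                            {"range": {"beginport": {"lte": port}}},
                            {"range": {"endport": {"gte": port}}},
                        ]
                    }
                },
            ],
            # Require one of the should clauses instead of treating them as optional.
            "minimum_should_match": 1,
        }
    }
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter context: no relevance scoring is computed.
                "filter": [
                    {"range": {"time": {"lte": ts_ms}}},
                    ip_conditions,
                ]
            }
        },
        "sort": [{"time": {"order": "desc"}}],
        "_source": ["username", "nasip", "mac"],
    }


if __name__ == "__main__":
    body = build_lookup_query(1721208610478, "102.13.203.209", "10.22.102.203", 203)
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST test_07_17/_search. Whether this changes the results depends on the intended semantics, so it is worth comparing hit counts against the original query before relying on it.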

Christian_Dahlqvist · August 13, 2024, 8:31am

Spinning disks support limited IOPS, so I am not surprised they are struggling with indexing and highly concurrent querying at the same time. This is why both the guide on tuning for indexing speed and the guide on tuning for search speed recommend using local SSDs.

I suspect you are simply constrained by a lack of IOPS, and would recommend you monitor IOPS, await and disk utilisation to verify that this is the case. If it is, I do not think there are any workarounds or magic solutions, so I would recommend you upgrade to SSDs.
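As a rough illustration of the three metrics mentioned above, the Python sketch below derives IOPS, average await and utilisation from two samples of Linux /proc/diskstats taken a known interval apart, in the same way tools such as iostat -x report them. It assumes a Linux host; the device name sda, the helper names and the sample numbers are hypothetical.

```python
from typing import Dict, NamedTuple


class DiskSample(NamedTuple):
    reads: int      # reads completed
    writes: int     # writes completed
    read_ms: int    # milliseconds spent reading
    write_ms: int   # milliseconds spent writing
    io_ms: int      # milliseconds the device was busy doing I/O


def parse_diskstats(text: str) -> Dict[str, DiskSample]:
    """Parse the contents of /proc/diskstats into per-device counters."""
    samples = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip blank or unexpected lines
        # Fields: major minor name, then the I/O counters.
        samples[f[2]] = DiskSample(
            reads=int(f[3]), writes=int(f[7]),
            read_ms=int(f[6]), write_ms=int(f[10]),
            io_ms=int(f[12]),
        )
    return samples


def disk_metrics(before: DiskSample, after: DiskSample, interval_s: float) -> dict:
    """Compute IOPS, average await (ms) and utilisation (%) between two samples."""
    ios = (after.reads - before.reads) + (after.writes - before.writes)
    wait_ms = (after.read_ms - before.read_ms) + (after.write_ms - before.write_ms)
    busy_ms = after.io_ms - before.io_ms
    return {
        "iops": ios / interval_s,
        "await_ms": wait_ms / ios if ios else 0.0,
        "util_pct": min(100.0, 100.0 * busy_ms / (interval_s * 1000.0)),
    }


if __name__ == "__main__":
    # Made-up samples taken 10 seconds apart, for illustration only.
    t0 = parse_diskstats("8 0 sda 1000 0 8000 5000 2000 0 16000 15000 0 9000 20000")
    t1 = parse_diskstats("8 0 sda 1500 0 12000 9000 2500 0 20000 21000 0 18800 30000")
    print(disk_metrics(t0["sda"], t1["sda"], 10.0))
```

On a real node the two samples would come from reading /proc/diskstats twice with a sleep in between. Utilisation that stays close to 100% together with a high await on the data disk during query load would be consistent with the IOPS limit described above.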
