Hi guys,
I have 30 TB of Elasticsearch data spread across 11 nodes. Indices are split by month, each with 5 primary shards and 5 replica shards. Shard sizes are very uneven: some are big (over 200 GB) and some are small (under 10 GB). The problem is that searching across all the data is slow: a single 6-word query takes more than a minute and maxes out the load on every node, and running a second search on top of it pushes the time to 2-3 minutes.
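For context, this is how I've been checking the per-shard sizes, sorted largest first (standard `_cat/shards` API; `logs-*` is just a placeholder for my real monthly index pattern):

```
GET _cat/shards/logs-*?v&s=store:desc&h=index,shard,prirep,store,node
```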
I was thinking of going to 12 nodes and pairing them up, so that each pair holds one year's worth of data: one node for the primaries and the other for the replicas. I'm hoping that this way the data can be cached better for searches and less communication between nodes is needed. By default a search would still hit all the data, but it would also be possible to search only the last year or the last two years (see the sketch below).
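As a sketch of what I mean, I believe shard allocation filtering could pin a year's indices to a node pair; the `year_group` attribute name and the `logs-2023-*` pattern here are only placeholders for illustration:

```
# elasticsearch.yml on the two nodes that should hold the 2023 data:
node.attr.year_group: 2023
```

```
# Then require all 2023 monthly indices to live on those nodes:
PUT logs-2023-*/_settings
{
  "index.routing.allocation.require.year_group": "2023"
}
```

Searching only the recent data would then just mean querying `logs-2024-*` (or an alias covering the last two years) instead of `logs-*`.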
However, my nodes don't have much memory (around 32 GB each, half of which goes to the JVM heap).
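For reference, the heap is pinned the usual way in `jvm.options`, so the other ~16 GB per node is left for the OS filesystem cache that Lucene reads from:

```
# jvm.options: fixed 16 GB heap (min = max to avoid resizing pauses)
-Xms16g
-Xmx16g
```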
Do you guys think this is a good approach? If not, what else can I do to improve the search time?
Thanks in advance!