Best Practices to handle an index of 1.2TiB

Hi there,
I have a 1.2TiB index, split into 9 shards of 200GiB each, and I also have a lifecycle policy that rolls the index through the hot/warm/cold phases.
Are there any best practices I can adopt to speed up searches?
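
For reference, per-shard sizes can be checked with the _cat/shards API; a minimal sketch (my-index below is just a placeholder for the real index name):

# index name is a placeholder
GET _cat/shards/my-index?v&h=index,shard,prirep,store,node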

Thanks

Hi Carmine,

Just to clarify, have you indexed the data through Elastic Enterprise Search (which includes App Search and Workplace Search), or are you using Elasticsearch directly? If it's Elasticsearch, the Elastic Stack subforum would be a better place to ask this question.

Hello @Vadim_Yakhin,
No, I have not used Elastic Enterprise Search; I'm using plain Elasticsearch.
I cannot create a topic in the Elastic Stack subforum.

Are you using time-based indices? Can you elaborate on the use case?

I'm not using time-based indices; the main issue is with search,

and I am not sure where the bottleneck is.

If you do not have time-based indices, ILM and hot/warm/cold zoning do not really apply. What is the issue with searching? It looks like load has increased, but latencies seem low.
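
If the slowness shows up on specific queries, the profile API can help show where the time goes inside a single search; a minimal sketch (the index name, field, and query below are placeholders, not your real search):

# placeholder index, field and query
GET test-logs/_search
{
  "profile": true,
  "query": {
    "match": { "message": "error" }
  }
}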

There is a misunderstanding here; I'll clarify by sharing some configuration.
This is the configuration I'm using when creating the ILM policy:

PUT _ilm/policy/test-logs
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "200gb"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "data": "warm"
            }
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 9
          }
        }
      },
      "cold": {
        "min_age": "14d",
        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "data": "cold"
            }
          },
          "freeze": {},
          "set_priority": {
            "priority": 20
          }
        }
      },
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
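
One note on the policy above: the allocate actions only take effect if the warm/cold nodes carry a matching custom node attribute, and the ILM explain API shows where each backing index currently sits in the policy. A rough sketch, assuming the attribute is called data as in the policy and the indices are named test-logs-*:

# elasticsearch.yml on a warm node (hot/cold nodes set hot/cold accordingly)
node.attr.data: warm

# check which phase/step each backing index is in
GET test-logs-*/_ilm/explain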

Then I apply the template with these index settings:

{
  "index": {
    "lifecycle": {
      "name": "test-logs",
      "rollover_alias": "test-logs"
    },
    "routing": {
      "allocation": {
        "require": {
          "data": "hot"
        }
      }
    },
    "refresh_interval": "5s",
    "number_of_shards": "9",
    "number_of_replicas": "0"
  }
}
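
For completeness, this is roughly how those settings end up in a template and how the first write index can be bootstrapped for the rollover alias. The index pattern and initial index name are assumptions based on the alias above, and Logstash's ILM support may create these for you:

PUT _template/test-logs
{
  "index_patterns": ["test-logs-*"],
  "settings": {
    "index.lifecycle.name": "test-logs",
    "index.lifecycle.rollover_alias": "test-logs",
    "index.routing.allocation.require.data": "hot",
    "index.refresh_interval": "5s",
    "index.number_of_shards": 9,
    "index.number_of_replicas": 0
  }
}

# bootstrap the first index so the alias has a write index to roll over
PUT test-logs-000001
{
  "aliases": {
    "test-logs": { "is_write_index": true }
  }
}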

And this is what I have configured in Logstash as the output:

output {
  elasticsearch {
    hosts => ["elasticxxxxx.local"]
    # let the plugin apply the ILM settings
    ilm_enabled => true
    # write alias targeted by the rollover action
    ilm_rollover_alias => "test-logs"
    # template is created outside Logstash, so don't manage it here
    template_name => "test-logs"
    manage_template => false
    index => "test-logs"
    # suffix pattern for rolled-over indices (test-logs-000001, ...)
    ilm_pattern => "000001"
    ilm_policy => "test-logs"
  }
}
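
To confirm that rollover is actually happening and see how big the rolled-over indices are, the _cat APIs can be used (names assumed from the config above):

# the alias should mark the newest test-logs-00000N as the write index
GET _cat/aliases/test-logs?v

# one line per rolled-over index with its size
GET _cat/indices/test-logs-*?v&h=index,pri,rep,docs.count,store.size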

(screenshot: normal behavior)

(screenshot: last 12h during office time)
