Best Practices to handle an index of 1.2TiB

Hi there,
I have a 1.2TiB index, split into 9 shards of 200GiB each, and I also have a lifecycle policy that rolls the index through the hot/warm/cold phases.
Are there any best practices I can adopt to speed up searches?
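
For reference, per-shard sizes can be checked with the _cat/shards API; a minimal sketch (my-index below is just a placeholder for the real index name):

# index name is a placeholder
GET _cat/shards/my-index?v&h=index,shard,prirep,store,node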

Thanks

Hi Carmine,

Just to clarify, have you indexed the data through Elastic Enterprise Search (which includes App Search and Workplace Search), or are you using Elasticsearch directly? If it's Elasticsearch, the Elastic Stack subforum would be a better place to ask this question.

Hello @Vadim_Yakhin,
No, I have not used Elastic Enterprise Search; I'm using plain Elasticsearch.
I cannot create a topic in the Elastic Stack subforum.

Are you using time-based indices? Can you elaborate on the use case?

I'm not using time-based indices; the main issue is with search,

and I am not sure where the bottleneck is.

If you do not have time-based indices, ILM and hot/warm/cold zoning do not really apply. What is the issue with searching? It looks like load has increased, but latencies seem low.
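
If the slowness shows up on specific queries, the profile API can help show where the time goes inside a single search; a minimal sketch (the index name, field, and query below are placeholders, not your real search):

# placeholder index, field and query
GET test-logs/_search
{
  "profile": true,
  "query": {
    "match": { "message": "error" }
  }
}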

There is a misunderstanding here; I'll clarify by sharing some configuration.
This is the configuration I'm using when creating the ILM policy:

PUT _ilm/policy/test-logs
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "200gb"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "data": "warm"
            }
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 9
          }
        }
      },
      "cold": {
        "min_age": "14d",
        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "data": "cold"
            }
          },
          "freeze": {},
          "set_priority": {
            "priority": 20
          }
        }
      },
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
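
One note on the policy above: the allocate actions only take effect if the warm/cold nodes carry a matching custom node attribute, and the ILM explain API shows where each backing index currently sits in the policy. A rough sketch, assuming the attribute is called data as in the policy and the indices are named test-logs-*:

# elasticsearch.yml on a warm node (hot/cold nodes set hot/cold accordingly)
node.attr.data: warm

# check which phase/step each backing index is in
GET test-logs-*/_ilm/explain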

Then I apply the template with these index settings:

{
  "index": {
    "lifecycle": {
      "name": "test-logs",
      "rollover_alias": "test-logs"
    },
    "routing": {
      "allocation": {
        "require": {
          "data": "hot"
        }
      }
    },
    "refresh_interval": "5s",
    "number_of_shards": "9",
    "number_of_replicas": "0"
  }
}
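
For completeness, this is roughly how those settings end up in a template and how the first write index can be bootstrapped for the rollover alias. The index pattern and initial index name are assumptions based on the alias above, and Logstash's ILM support may create these for you:

PUT _template/test-logs
{
  "index_patterns": ["test-logs-*"],
  "settings": {
    "index.lifecycle.name": "test-logs",
    "index.lifecycle.rollover_alias": "test-logs",
    "index.routing.allocation.require.data": "hot",
    "index.refresh_interval": "5s",
    "index.number_of_shards": 9,
    "index.number_of_replicas": 0
  }
}

# bootstrap the first index so the alias has a write index to roll over
PUT test-logs-000001
{
  "aliases": {
    "test-logs": { "is_write_index": true }
  }
}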

And this is what I have configured in Logstash as the output:

output {
  elasticsearch {
    hosts => ["elasticxxxxx.local"]
    # let the plugin apply the ILM settings
    ilm_enabled => true
    # write alias targeted by the rollover action
    ilm_rollover_alias => "test-logs"
    # template is created outside Logstash, so don't manage it here
    template_name => "test-logs"
    manage_template => false
    index => "test-logs"
    # suffix pattern for rolled-over indices (test-logs-000001, ...)
    ilm_pattern => "000001"
    ilm_policy => "test-logs"
  }
}
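
To confirm that rollover is actually happening and see how big the rolled-over indices are, the _cat APIs can be used (names assumed from the config above):

# the alias should mark the newest test-logs-00000N as the write index
GET _cat/aliases/test-logs?v

# one line per rolled-over index with its size
GET _cat/indices/test-logs-*?v&h=index,pri,rep,docs.count,store.size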

(screenshot: normal behavior)

(screenshot: last 12h during office time)
