Index Backup, Archiving and Purging Strategy

Hi,

We have a single index (5 shards) that holds about 2 million records and is growing rapidly. It reaches around 200 GB (2 million records) in 15 days.
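For reference, here is a rough sizing from those figures. It assumes decimal gigabytes, a constant ingest rate, and that the 200 GB is spread evenly over the 5 shards; none of that has been measured precisely.

```python
# Back-of-the-envelope sizing from the numbers in this post.
# Assumptions: decimal GB, constant ingest rate, even shard distribution.
GB = 1000 ** 3

index_size_gb = 200        # size reached in one 15-day window
records = 2_000_000        # records in that window
window_days = 15
shards = 5
retention_days = 365       # data expires after roughly one year

bytes_per_record = index_size_gb * GB / records
gb_per_shard = index_size_gb / shards
gb_per_day = index_size_gb / window_days
gb_per_year = gb_per_day * retention_days

print(f"average record size: {bytes_per_record / 1000:.0f} KB")
print(f"size per shard after {window_days} days: {gb_per_shard:.0f} GB")
print(f"growth per day: {gb_per_day:.1f} GB")
print(f"data kept for one year: {gb_per_year / 1000:.1f} TB")
```

So each record averages about 100 KB, and a year of retention is a little under 5 TB before replicas.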

Each record also has nested documents. A sample record is attached below.

We observed that reading from and writing to the same index is slow. We want to make the active index lighter, to improve query and indexing performance, by moving older data into other indices (warm, read-only) where it can still be searched. Once the data expires (after around 1 year), we would purge those indices.

As far as I know, what we want is similar to Index Lifecycle Management (ILM), where indices are categorized as HOT, WARM and COLD.
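To make the comparison concrete, this is a minimal sketch of the kind of ILM policy body we have in mind, built as a Python dict that could be sent with PUT _ilm/policy/<name>. The policy values (rollover at 200gb or 15d, warm after 15d, cold after 90d) and the "data": "cold" node attribute are assumptions for illustration; only the roughly one-year retention comes from our requirement. As far as we understand, ILM moves whole indices based on age since rollover, not individual documents based on their fields.

```python
import json

# Hypothetical ILM policy body. Phase timings other than the ~1 year
# retention are illustrative assumptions, not tested values.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Start a new write index at 200 GB or 15 days, whichever comes first.
                    "rollover": {"max_size": "200gb", "max_age": "15d"}
                }
            },
            "warm": {
                "min_age": "15d",
                "actions": {
                    "readonly": {},
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            "cold": {
                "min_age": "90d",
                "actions": {
                    # Assumes nodes tagged with a custom attribute data=cold.
                    "allocate": {"require": {"data": "cold"}}
                },
            },
            "delete": {
                "min_age": "365d",
                "actions": {"delete": {}},
            },
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```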

In our case, however, we want to move data based on fields in the document, such as creationDate or the status of the document.

We have thought of using Curator jobs to run a reindex, which supports a query on the documents so that only selected documents are reindexed, and after that rolling the data over to a COLD index.
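A minimal sketch of the selective reindex we are considering, written as the request bodies for POST _reindex and POST <index>/_delete_by_query. The index names records-active and records-warm, the 15-day cutoff and the CLOSED status value are hypothetical; timeRaised and status are taken from the sample record, and the term filter assumes status is mapped as a keyword.

```python
import json

def selection_query(older_than, status):
    """Match documents raised before the cutoff that have the given status."""
    return {
        "bool": {
            "filter": [
                {"range": {"timeRaised": {"lt": older_than}}},
                {"term": {"status": status}},  # assumes a keyword mapping
            ]
        }
    }

def build_reindex_body(source_index, dest_index, older_than, status):
    """Body for POST _reindex: copy only the selected documents."""
    return {
        "source": {"index": source_index, "query": selection_query(older_than, status)},
        "dest": {"index": dest_index},
    }

def build_delete_body(older_than, status):
    """Body for POST <source>/_delete_by_query: remove what was copied."""
    return {"query": selection_query(older_than, status)}

# Hypothetical index names and selection values.
reindex_body = build_reindex_body("records-active", "records-warm", "now-15d", "CLOSED")
delete_body = build_delete_body("now-15d", "CLOSED")

print(json.dumps(reindex_body, indent=2))
print(json.dumps(delete_body, indent=2))
```

The delete step is what would actually make the active index lighter; the reindex alone only copies the documents.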

We are also planning to run _forcemerge on the read-only WARM and COLD indices.
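Roughly, the sequence we would run against each warm or cold index looks like the sketch below: first block writes, then force merge. The index name records-warm and the choice of merging down to a single segment are assumptions; the helper only builds the requests and does not send them.

```python
def warm_index_requests(index, max_num_segments=1):
    """Return (method, path, body) tuples: block writes, then force merge."""
    return [
        # Make the index read-only for writes before merging.
        ("PUT", f"/{index}/_settings", {"index.blocks.write": True}),
        # Merge segments down to the requested count.
        ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None),
    ]

# Hypothetical index name.
requests_to_send = warm_index_requests("records-warm")
for method, path, body in requests_to_send:
    print(method, path, body)
```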

Are we on the right path, or could you recommend a more efficient way to handle this scenario? Will this make a difference to indexing and query performance?

Record:

{
	"id": "record id",
	"fieldB": "sdfsdfsdf",
	"fieldC": "sgfdgdfgg",
	"fieldD": "MINOR",
	"fieldE": 40,
	"fieldF": 3,
	"fieldG": "sdfsdfsdf",
	"fieldH": "Value",
	"timeRaised": "2019-09-10T16:51:15.015Z",
	"timeChanged": "2019-09-10T16:51:15.015Z",
	"statusChangeDate": "2019-09-05T16:51:15.015Z",
	"status": "SUBMITTED",
	"fieldI": "E2E Service Team",
	"fieldJ": [
		{
			"id": "rtertretret",
			"name": "dfgdfgdfgfd",
			"type": "dfgdfgdfg"
		}
	],
	"fieldK": [
		{
			"id": "sdfsdfdg1",
			"name": "dfgdfgdfg",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		},
		{
			"id": "sdfsdfdg2",
			"name": "dfgdfgdfg",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		},
		{
			"id": "sdfsdfdg3",
			"name": "dfgdfgdfg",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		},
		{
			"id": "sdfsdfdg4",
			"name": "dfgdfgdfg",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		}
	],
	"fieldL": [
		{
			"id": "sdfsdfsdf",
			"rank": 100
		}
	],
	"fieldM": [],
	"fieldN": [],
	"fieldO": [
		{
			"id": "6d31695c-d460-43d1-a414-e8a3422d0d29s",
			"name": "sdfsdf",
			"type": "VsdfM",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		},
		{
			"id": "6d31695c-d460-43d1-a414-e8a3422d0d29w",
			"name": "sdfsdf",
			"type": "VsdfM",
			"sub-field1": "NORMAL",
			"sub-field2": "Product",
			"sub-field3": "sdfsdfsdf",
			"sub-field4": "NORMAL"
		}
	],
	"fieldP": [],
	"fieldQ": [],
	"fieldR": [],
	"fieldS": [],
	"fieldT": "",
	"fieldU": 0,
	"fieldV": 0,
	"fieldw": ""
}

Thanks.
