Count indices, snapshot and delete with Curator

To write a Python script that snapshots old indices, I need the total count of indices in Elasticsearch. I tried the following commands, but they return a list of indices, not the count:

GET /_cluster/state/_all/

GET graylog*/_aliases?pretty

Is there a clear command to get the count of indices?

There is not, maybe just use a curl with a wc?

Like this:

curl -s http://esclient:9200/_cat/indices | wc -l
96
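Since the goal is a Python script, the same count can also be taken from the client instead of shelling out. This is a minimal sketch, assuming the elasticsearch Python client, whose cat.indices call returns the same plain-text listing as the curl command above. The helper name count_indices is hypothetical, and the client is passed in rather than created here.

```python
def count_indices(client):
    """Return the number of indices reported by the _cat/indices API.

    `client` is assumed to be an elasticsearch.Elasticsearch instance.
    h='index' asks the cat API for the index-name column only,
    so every non-empty line of the response is one index.
    """
    listing = client.cat.indices(h='index')
    return len([line for line in listing.splitlines() if line.strip()])


# Example usage (assumes a reachable cluster at this address):
# from elasticsearch import Elasticsearch
# print(count_indices(Elasticsearch(['http://esclient:9200'])))
```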

@theuntergeek Thanks, it works.

I want to create a scheduled process that takes snapshots and deletes old indices.
Curator seems to be the easiest way to do this, but there is little reference documentation. I am using Elasticsearch version 2.x. Will Curator version 4.x work with Elasticsearch 2.x?

If you have a sample configuration, please share it. It should create a snapshot once the total number of indices exceeds 50 and then delete the indices that were snapshotted.

Conditions:
The total number of indices kept in Elasticsearch should be 50.
If the total number of indices is greater than 50:
create a snapshot of the oldest indices
delete the successfully snapshotted indices from Elasticsearch
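The selection rule in these conditions can be sketched in plain Python. This is only an illustration of the logic, not Curator's implementation: indices_to_archive is a hypothetical helper, and it assumes the creation time of each index has already been collected into a dictionary (for example from the index settings).

```python
def indices_to_archive(creation_dates, keep=50):
    """Return the oldest index names beyond the `keep` newest ones.

    `creation_dates` maps index name -> creation time (any sortable
    number, e.g. epoch milliseconds). The result is ordered oldest first.
    """
    oldest_first = sorted(creation_dates, key=creation_dates.get)
    excess = len(oldest_first) - keep
    return oldest_first[:excess] if excess > 0 else []
```

With 53 indices and keep=50, the three oldest names are returned. With 50 or fewer, the list is empty and nothing needs to be snapshotted or deleted.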

Yes, Curator 4.x works with Elasticsearch 2.x.

Try the count filter

I'm not in the habit of writing complete configurations for others, but rather helping them sort them out when they don't work. Teach a man to fish, rather than give a man a fish.

I tried to list and print all indices with the code below, based on the Curator documentation (https://curator.readthedocs.io/en/latest/examples.html)

Code:

from elasticsearch import Elasticsearch
import curator

es  = Elasticsearch(['http://192.168.1.1'], port=9200)
ilo = curator.IndexList(es)
ilo.filter_by_regex(kind='prefix', value='cars_')
print(ilo)

But the code prints the following instead of a list of indices:

<curator.indexlist.IndexList object at 0x162d590>

How do I list the indices before setting a delete condition?

That's a great question. At issue, in your code, ilo is an IndexList object. Within that object is the metadata and index list. The current actionable list of indices can be accessed with .indices:

print(ilo.indices)

However, that will print as a list:

['index1', 'index2', 'index3',...'indexN']

If you want them printed one per line, you'd have to loop through them:

for idx in ilo.indices:
    print(idx)

You may even want them sorted alphabetically:

for idx in sorted(ilo.indices):
    print(idx)

@theuntergeek Thanks. The code works perfectly.

But there is one thing I have to sort out. The output indices are not in a proper order, so I can't pick the exact indices to operate on. Is there a way to pass a parameter to "ilo.indices" to order the output?

The output indices look like this:

print(ilo.indices)

[u'car_19', u'car_18', u'car_31', u'car_30', u'car_51. ....

Order technically doesn't matter, but if you want it ordered in the output, use sorted like I do above:

print(sorted(ilo.indices))
print(sorted(ilo.indices))

The above command sorts in the wrong order:

car_0
car_1
car_10
car_11
car_12
car_13
car_14
car_15
car_16
car_17
car_18
car_19
car_2
car_20
car_21
car_22
car_23
car_24

It does not sort like this:

car_0
car_1
car_2
car_3
.
.
car_10

So print(sorted(ilo.indices)) does not list the indices in ascending numeric order. Can I sort indices by time (index creation time)?

Again, to reiterate, the order technically does not matter. Your filters, including sorting by age or pattern, have been applied appropriately. This working list would be passed to whichever action exactly as-is.

If you want to jump through flaming hoops just to make it output in a more user-friendly fashion, that's outside the scope of Curator's API.
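For display purposes only, the numeric ordering asked about above can be had with a natural sort key in plain Python. This is a sketch outside Curator's API; natural_key is a hypothetical helper, and the sample names are made up to mirror the car_N pattern in the thread.

```python
import re


def natural_key(name):
    """Split 'car_10' into ['car_', 10, ''] so digit runs compare as numbers."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', name)]


names = ['car_19', 'car_2', 'car_10', 'car_0', 'car_1']
print(sorted(names, key=natural_key))
# prints ['car_0', 'car_1', 'car_2', 'car_10', 'car_19']
```

The same key can be passed when printing a working list, as in sorted(ilo.indices, key=natural_key). It changes only the printed order, not what an action operates on.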

I used the following Python script to create a snapshot of an index using the Curator library.

repository_name = 'backup_repo'
get_indicies = 'car_1'
snap_indices = curator.Snapshot(get_indicies, repository=repository_name,  name=get_indicies)
snap_indices.do_action()

So how do I get a true or false value once the snapshot has completed successfully?

Will 'wait_for_completion=True' return a value indicating the success of the operation?

That's not how this works. If you read the API Documentation for the snapshot action, you'll find that the first argument expected is an IndexList object, like the ilo you created in your other example. You can't just pass a variable with a single index in it. It will fail with an exception, because you will have passed a non-IndexList object to it. Part of the reason for this need is that the IndexList object has a client object inside, and the snapshot action object requires this client to perform a snapshot.

You don't get a true or false value. If the snapshot succeeds, the action simply doesn't raise an exception.

No, wait_for_completion=True does not return a value. If you have properly set up logging, however, you will be able to watch it check every wait_interval seconds (the default is 9) and report the progress of the snapshot. It will report the snapshot state based on this method.

If you want to check the state of the snapshot afterward, a value of SUCCESS means it was successful. This value can be obtained by running:

>>> from elasticsearch import Elasticsearch
>>> import curator
>>> es  = Elasticsearch(['http://192.168.1.1'], port=9200)
>>> d = es.snapshot.status(repository='REPOSITORY_NAME', snapshot='SNAPSHOT_NAME')
>>> d['snapshots'][0]['state']
u'SUCCESS'

The dictionary that comes back will have the status at the indicated path. It will always be at index 0 if you only specify one snapshot at snapshot='SNAPSHOT_NAME'.

The value can be one of: 'SUCCESS', 'PARTIAL', 'FAILED', 'IN_PROGRESS'.
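To turn that state into the true or false value asked about earlier, the lookup can be wrapped in a small function. This is a minimal sketch built on the same es.snapshot.status call shown above; snapshot_succeeded is a hypothetical helper name, and the client is passed in as an argument.

```python
def snapshot_succeeded(client, repository, snapshot):
    """Return True only when the named snapshot reports state 'SUCCESS'.

    `client` is assumed to be an elasticsearch.Elasticsearch instance.
    Any other state ('PARTIAL', 'FAILED', 'IN_PROGRESS') yields False.
    """
    status = client.snapshot.status(repository=repository, snapshot=snapshot)
    return status['snapshots'][0]['state'] == 'SUCCESS'
```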


@theuntergeek

I take snapshots of indices one by one with the following code, and I need to add index deletion to it.

ilo = curator.IndexList(es)
ilo.filter_by_age(source=SOURCE, direction=DIRECTION, unit=UNIT, unit_count=UNIT_COUNT)

for idx in ilo.indices:
    get_indicies = idx.strip("[u']")

    snap_indices = curator.Snapshot(ilo, repository=self.repository_name, name=get_indicies)
    snap_indices.do_action()
    snap_indices_status = snap_indices.get_state()

    if snap_indices_status == 'SUCCESS':
        delete_indices = curator.DeleteIndices(ilo)
        delete_indices.do_action()
    elif snap_indices_status == 'FAILED':
        print("Failed")

But how do I pass only the name of the index that was snapshotted to the delete action, instead of deleting all old indices?

As per the above example

  delete_indices = curator.DeleteIndices(ilo)
  delete_indices.do_action()

will delete all old indices that meet the filtering criteria.

The DeleteIndices action class takes only two arguments, ilo and master_timeout:

 class curator.actions.DeleteIndices(ilo, master_timeout=30)

Can I pass a single index name to the above class as a parameter?

My requirement is as follows.
If indices are older than 1 month:

  1. Back up (snapshot) the oldest index.
  2. If the backup is successful, delete that index from Elasticsearch.
  3. Then snapshot the next oldest index, delete it if successful, and continue.

Snapshot and delete one index at a time.


This suggests a misunderstanding of how snapshots work. This is not recommended procedure. I recommend reading this comment to better understand how snapshots work in Elasticsearch, and why Curator does not support or suggest one-index-per-snapshot snapshots.

No. To get to a single index, you have to apply filters.
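As an illustration of filtering down to one index, an anchored regular expression can be passed to the filter_by_regex method already used earlier in this thread. This is a sketch only: keep_only is a hypothetical helper, it assumes an existing IndexList object is passed in as ilo, and it does not change the recommendation above against one-index-per-snapshot snapshots.

```python
import re


def keep_only(ilo, index_name):
    """Filter `ilo` down to the single index called `index_name`.

    `ilo` is assumed to be a curator.IndexList. The ^...$ anchors make the
    pattern match the whole name, so 'car_1' does not also keep 'car_10'.
    """
    ilo.filter_by_regex(kind='regex', value='^' + re.escape(index_name) + '$')
    return ilo.indices
```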


@theuntergeek

As you said, to get a single index, which parameters should I set in

 IndexList.filter_by_age

 IndexList.filter_by_age(source='name', direction=None, timestring=None, unit=None,   unit_count=None, field=None, stats_result='min_value', epoch=None, exclude=False)

and in the DeleteIndices class

class curator.actions.DeleteIndices(ilo, master_timeout=30)

Can I pass the single index as below?

es  = Elasticsearch()
ilo = curator.IndexList(es)
ilo.filter_by_age(source='creation_date', direction='older', unit='days', unit_count=280)

delete_indices = curator.DeleteIndices(ilo, name=get_indicies)
delete_indices.do_action()

Here I pass a name argument (name=get_indicies, a single index name). Does DeleteIndices accept name as a valid parameter, or will it raise an error?

curator.DeleteIndices(ilo, name=get_indicies)

I got the following error:

Error: __init__() got an unexpected keyword argument 'name'

No, you can't. The Action API calls exclusively depend on either IndexList or SnapshotList objects. They do not accept individual indices or snapshots as calls. You create the necessary object, filter out all of the indices you do not want to perform the action on, and then pass that object to the Action object.

You seem very determined to ignore the best practices recommendation I presented earlier, and go ahead with 1 index per snapshot. I wish you the best of luck with that. You will have to figure out how to make that work using the plain elasticsearch Python client module. Curator's API will not help you with what you're trying to do. It was designed to fit the use case I presented in the best-practices recommendation. What you are seeking to do is outside that scope, and so the Curator API is perhaps not the best fit for you.
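For anyone taking the plain-client route, a snapshot-then-delete step might look roughly like the sketch below. It is not Curator code and makes several assumptions: snapshot_then_delete is a hypothetical helper, the client is an elasticsearch-py client from the 2.x to 7.x era that accepts a body argument on snapshot.create, and the repository already exists. It puts all the given indices into one snapshot and deletes them only if that snapshot reports SUCCESS.

```python
def snapshot_then_delete(client, repository, snapshot_name, indices):
    """Snapshot `indices` into one snapshot; delete them only on SUCCESS.

    Returns True if the indices were deleted, False otherwise.
    """
    if not indices:
        return False  # nothing to do

    index_list = ','.join(indices)
    response = client.snapshot.create(
        repository=repository,
        snapshot=snapshot_name,
        body={'indices': index_list},
        wait_for_completion=True,  # block until the snapshot finishes
    )

    # With wait_for_completion=True the response carries the final state.
    state = response.get('snapshot', {}).get('state')
    if state != 'SUCCESS':
        return False  # keep the indices on PARTIAL, FAILED, or missing state

    client.indices.delete(index=index_list)
    return True
```

Deleting only after a confirmed SUCCESS keeps the data in the cluster whenever the snapshot is partial or failed.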

