Hello,
I just managed to link my Graylog/Elasticsearch server to my S3 bucket:
curl -X PUT "localhost:9200/_snapshot/my_s3_repository?pretty" -H 'Content-Type: application/json' -d'
{
"type": "s3",
"settings": {
"bucket": "mysiem"
}
}
'
{
"acknowledged" : true
}
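You can check that the repository was registered correctly, and that the nodes can actually write to the bucket, with the snapshot API (assuming the same localhost:9200 endpoint as above):

```shell
# Show the repository definition
curl -s 'localhost:9200/_snapshot/my_s3_repository?pretty'

# Ask all nodes to verify write access to the repository
curl -s -X POST 'localhost:9200/_snapshot/my_s3_repository/_verify?pretty'
```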
Graylog creates a new index every day, and I'd like to send all indices older than 2 weeks to my S3 bucket (and of course I don't want to keep them on my servers anymore).
I installed the elasticsearch-curator package.
Reading the docs, I have two problems:
- I can't find the configuration files (for example, curator.yml)
- I don't understand how to write the configuration to match my criteria (send indices older than 2 weeks to S3 without keeping them on my local server)
Does anyone have an idea to help me? Thank you very much.
No configuration files are created during install, even from the RPM/DEB packages. You must create these yourself. Example action definitions are in the docs here. The curator.yml definition example is in the docs here.
You've already created a repository named my_s3_repository. You would create an action file with steps something like the below. Note that I snapshot after 13 days and delete after 14. This gives you a chance to ensure that the snapshots did occur before the indices are deleted on day 14.
---
actions:
  1:
    action: snapshot
    description: >-
      Snapshot myindex- prefixed indices older than 13 days (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip
      the repository filesystem access check. Use the other options to create
      the snapshot.
    options:
      repository: my_s3_repository
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: myindex-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 13
  2:
    action: delete_indices
    description: >-
      Delete indices older than 14 days (based on index creation_date), for
      myindex- prefixed indices. Ignore the error if the filter does not
      result in an actionable list of indices (ignore_empty_list) and exit
      cleanly.
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: myindex-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 14
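Once the two files exist, Curator is invoked from the command line. A sketch, assuming the config lives at ~/.curator/curator.yml and the action file path is a placeholder you substitute yourself:

```shell
# Preview which indices the actions would touch, without changing anything
curator --config ~/.curator/curator.yml --dry-run /path/to/action.yml

# Run for real once the dry-run output looks right
curator --config ~/.curator/curator.yml /path/to/action.yml
```

The --dry-run flag is worth using the first few times, since delete_indices is not reversible.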
Thank you for your help, what you explained helps me, but I still have a few questions:
- I see the prefix "myindex" in the action file, but when I look at my indices they have totally random names that don't start with "myindex". Isn't that going to be a problem?
ls /mnt/graylog/elasticsearch/nodes/0/indices/
3lurA30KRT66K3_YeUBBDA I_l4ljXmQwmzqeuDQLXrTA ROkebCqiS2SKmtnoLd5K-g
6jsIyLtkSJuw7I6FM5aeLg JZu86JSRSqe41AjVMOhJcA RuKIH_8GT_282VjvsGxxCA
71Jx5NuITr-mh-XYrBghvg kY2qgTamQ1-IsV1KfRfmLQ WNniZ76vRZm_U0xiny7OUQ
8EsR-1jRQVWwEI9XLvKLag lNzkHX6ATle462uQwH3fWg XAzvAgllTnO0GAyMEDX47Q
AdixeDwmTTOs6fDIV8yYyg lR0MpE-rS-6_aRXutCXuog XhmTQ3n2Roaxe6JCpX6aBQ
ATP7FWk_SpSqchDm2nRIKA M1YpCI7FQ-eYWh2qHWPJdw xlq-jtI4Twu0kJJsc-L9PA
B2yRoFQDRje6sSZrwUZ1yg mt3t8SLsQ_CNxdCCGgaFBw X_qzLuCyTHSqdhqBeh-0qw
BcNUyhiiS6unxRCK3PShlA myaa-7qVRbqjQq6tXblZzA YchNdqu6S-2oUMygO8EEBQ
dtKqyzB0RgiOhA8IKY02FA n47EEchMT4K1qOqysyR_yw yJbEaRg7Qvu4pxjV3W0OiQ
dYCzPuBaR6qSAF-4PqwHuQ OzlEyYTgRW-ESstP7uqksA Yl5h2WeXRNW4Kpugtj1gOg
eKJgvnJcStGsqI0PGCGxcw P6DUFVqEQo-YseozDhn15A yp8nqvp0QnqIBbMsaj_n_Q
gMVH_r3zQKGz5cJT4M1Htw p9jkLjGISjK22aXuj6YCvg YvnIgCv5QPugXtrGillH8w
gQbHFxbFQe2bQ_9-CLRudg PESgE_FwTU6kEV8rKeN5xA YXxNwkPHRmmLz52Ths43zg
GrxLvdJ1StWHOhrYrmyZhw PGES4n4WSyunJiUz5TcBtg zBebioJGRZyg3aVV2nv-Kw
hD65pvCiTY27YEaRfAYO7g Ph7nwRutQOe4e3lrm60K7Q Ze2VXGWKReq6ReoWw8uzXA
hNmUnkrZR-WP1Ksk30sSYA pLWbpvYrRWCLmTbuImQ6iQ zMacyH4ARgOYXjbFRgaUsA
hXiT4VpxRni39hg2qcN9-g QpkZaipcTNSbyyjXZk3edg
- Can I create the curator.yml and action.yml files (I don't know if there's another way to name them) wherever I want, or are there recommended folders?
- Once the two files are created, do I have to do something, or will the process be automatic?
Thank you very much
No. This behavior was changed all the way back in Elasticsearch 2.x. The index names are deliberately obfuscated in the filesystem, but the cluster state knows what the index names are and maps them accordingly. This was done to force end users to use the API to interact with indices, and never directly with the file system.
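To see the real index names the cluster knows about, query the cat API rather than the filesystem (again assuming Elasticsearch on localhost:9200):

```shell
# List indices with their human-readable names, doc counts and sizes
curl -s 'localhost:9200/_cat/indices?v'
```

Curator filters match against these API-level names, not the obfuscated directory names on disk.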
They do not have to be named curator.yml and action.yml; however, if you name your configuration file curator.yml and place it in a .curator directory in your home directory, you will not need to specify --config and the path to the configuration on the command line. See the documentation note for more details.
You must either run Curator manually or via some scheduler like cron at the intervals you require.
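For example, a crontab entry (paths hypothetical, adjust to where you put your files and where curator is installed) that runs Curator every night at 01:00 could look like:

```
0 1 * * * /usr/bin/curator --config /etc/elasticsearch/curator.yml /etc/elasticsearch/action.yml >> /var/log/curator.log 2>&1
```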
Thank you for your answer,
Yes it works thanks, but I still have a question about that
I created the file /etc/elasticsearch/curator.yml and configured it like this:
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  username:
  password:
  timeout: 30
  master_only: False
I also created and configured the file /etc/elasticsearch/action.yml; the configuration is a little different from yours because I just wanted to run a test:
actions:
  1:
    action: snapshot
    description: >-
      Snapshot graylog_ prefixed indices older than 90 days (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip
      the repository filesystem access check. Use the other options to create
      the snapshot.
    options:
      repository: my_s3_repository
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
    filters:
    - filtertype: pattern
      kind: prefix
      # With kind: prefix the value is a literal prefix, not a glob,
      # so use graylog_ rather than graylog*
      value: graylog_
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 90
My Graylog server has been receiving data for well over 2 months.
The small problem is that several folders are created in my bucket, which is very disorganized and hard to navigate; the file names remain incomprehensible, so we can't really see the creation date of an index (Graylog creates one index per day). Can the config file be improved to address this concern?
For example, in my "indices" folder there are two folders with 15 files each, and it's impossible to tell which file corresponds to which day; I don't see any "curator-%Y%m%d%H%M%S". Thank you
You should never directly interact with the files in your data path, or the files in your S3 bucket. These file names are obfuscated on purpose. You should only manage indices via the API, and you should likewise only manage snapshots via the API.
The reason you can't make sense of the snapshot folders is because they hold shard data, which are segments. These will never be named in a way you can understand (just like the index directory you shared previously). The Elasticsearch cluster knows how to interpret the metadata files and figure out where your data needs to be, so please leave it at that.
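To see your snapshots by their human-readable 'curator-%Y%m%d%H%M%S' names, query the snapshot API instead of browsing the bucket (assuming the repository name from earlier in the thread):

```shell
# Full detail for every snapshot in the repository
curl -s 'localhost:9200/_snapshot/my_s3_repository/_all?pretty'

# Or a compact tabular view: name, state, start/end time, indices
curl -s 'localhost:9200/_cat/snapshots/my_s3_repository?v'
```

Restores work the same way: you address a snapshot by its name through the API, and Elasticsearch resolves the obfuscated files in the bucket for you.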