Is a snapshot repo created by default?


(B Zen) #1

I'm using Logstash with Elasticsearch to index data. I don't remember setting up snapshots for my indices, but when I looked at the data folder today, a snapshot directory had been created.
Any idea how this can happen? I'm using Elasticsearch 1.5.0 and Logstash 1.5.0.

I added these to my elasticsearch.yml:

bootstrap.mlockall: true
indices.memory.index_buffer_size: 50%
index.translog.flush_threshold_ops: 50000
indices.cluster.send_refresh_mapping: false
index.store.type: mmapfs
index.merge.scheduler.max_thread_count: 2
index.refresh_interval: 30s

UPD: When I grepped through the logs I got the following. I hope it helps you diagnose this.

[2015-07-17 00:21:14,164][INFO ][repositories             ] [host] delete repository [quickrun]
[2015-07-17 00:39:02,810][INFO ][cluster.metadata         ] [host] [logstash-2015.07.17] update_mapping [logs] (dynamic)
[2015-07-17 00:56:00,728][INFO ][cluster.metadata         ] [host] [logstash-2015.07.17] update_mapping [logs] (dynamic)
[2015-07-17 01:02:02,706][INFO ][cluster.metadata         ] [host] [logstash-2015.07.17] update_mapping [logs] (dynamic)
[2015-07-17 01:10:46,590][INFO ][cluster.metadata         ] [host] [logstash-2015.07.17] update_mapping [logs] (dynamic)
[2015-07-17 01:12:26,449][INFO ][cluster.metadata         ] [host] [logstash-2015.07.17] update_mapping [logs] (dynamic)
[2015-07-17 01:17:21,934][INFO ][repositories             ] [host] put repository [quickrun]
...
[2015-07-17 11:19:04,103][INFO ][snapshots                ] [host][snapshot [quickrun:1437157048-_all] is done

(Mark Walkom) #2

Someone must have set one up.

Check your logs and also look at the _snapshot API to get some more details.
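
One way to follow the "check your logs" advice is to filter the node log down to the lines written by the repository and snapshot loggers. The sketch below is only an illustration: it assumes the `[timestamp][LEVEL][logger] [node] message` layout seen in the log excerpt in the first post, and `snapshot_events` is a hypothetical helper name, not part of Elasticsearch.

```python
import re

# Assumed layout, taken from the log excerpt in this thread:
# [timestamp][LEVEL ][logger        ] [node] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<logger>[^\]]+?)\s*\]\s*(?P<msg>.*)$"
)

def snapshot_events(lines):
    """Return (timestamp, logger, message) for repository and snapshot log lines."""
    events = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        # Keep only the two loggers that report repository and snapshot activity.
        if m and m.group("logger") in ("repositories", "snapshots"):
            events.append((m.group("ts"), m.group("logger"), m.group("msg")))
    return events
```

Feeding it the lines of the node's log file would leave only the `put repository`, `delete repository` and snapshot completion entries, which makes any regular pattern in the timestamps easier to spot.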


(B Zen) #3

@warkolm, I killed my old ES cluster and cleared out all the data. Elasticsearch ran without any problems for one day. After that I started to see snapshots alongside the Elasticsearch data in the data folders. I know for sure that nobody created a snapshot repo.

I have attached my .yml config in the post. Is this expected behaviour? Tell me if you need more details.


(B Zen) #4

This is what I get when I do GET /_snapshot/. I have no clue why this is happening.

{
  "quickrun": {
    "type": "fs",
    "settings": {
       "max_restore_bytes_per_sec": "1gb",
       "location": "/data/snapshots/quickrun_repo",
       "max_snapshot_bytes_per_sec": "1gb"
     }
  }
}

One more thing: I have _ttl enabled in Elasticsearch, using a default of 6h. @warkolm, can you help me with this?

This is the template I use for Logstash logs:

{
"logstemplate": {
  "order": 0,
  "template": "logstash-*",
  "settings": {
     "index.refresh_interval": "5s"
  },
  "mappings": {
     "_default_": {
        "dynamic_templates": [
           {
              "message_field": {
                 "mapping": {
                    "index": "analyzed",
                    "omit_norms": true,
                    "type": "string"
                 },
                 "match_mapping_type": "string",
                 "match": "message"
              }
           },
           {
              "string_fields": {
                 "mapping": {
                    "index": "analyzed",
                    "omit_norms": true,
                    "type": "string",
                    "fields": {
                       "raw": {
                          "index": "not_analyzed",
                          "ignore_above": 256,
                          "type": "string"
                       }
                    }
                 },
                 "match_mapping_type": "string",
                 "match": "*"
              }
           }
        ],
        "_ttl": {
           "enabled": true,
           "default": "6h"
        },
        "properties": {
           "geoip": {
              "dynamic": true,
              "properties": {
                 "location": {
                    "type": "geo_point"
                 }
              },
              "type": "object"
           },
           "@version": {
              "index": "not_analyzed",
              "type": "string"
           }
        },
        "_all": {
           "enabled": true,
           "omit_norms": true
        }
     }
  },
  "aliases": {}
}
}

(Colin Goodheart-Smithe) #5

If a snapshot repository exists in your cluster, someone must have added it. Elasticsearch does not add snapshot repositories automatically. Did you change your cluster name from the default (elasticsearch)? Is your cluster open to internet access? Keeping the default name or exposing the cluster to the internet puts it in a vulnerable state and leaves it open to unintentional or malicious attacks. See https://www.elastic.co/blog/scripting-security for more details on basic security for your cluster.


(Jason Wee) #6

Just out of curiosity: you have index_buffer_size set to 50%. Has the node run into an OOM before? How many ES nodes do you have in the cluster, and what's the use case like?


(B Zen) #7

I have 4 ES nodes. I am using Logstash and Elasticsearch to process logs at a rate of 1 GB every 5 minutes.


(B Zen) #8

@colings86 I changed my cluster name to QuickrunCluster. The ES clusters are on a secure network, so I don't think they are vulnerable to attack. I deleted the snapshot that was created and removed the snapshot template, but it keeps getting created again. Thanks!


(Mark Walkom) #9

Again, check your logs!

Also, don't use TTL if you are using time-based indices; it's a waste.


(B Zen) #10

@warkolm I have updated the details in the post. The logs are what I got. One weird thing is that every time I delete the snapshot, it gets created again after an hour. Can you please help me debug this?

Also, regarding your second comment, can you please elaborate on that? Thanks for the help.

This is the snapshot template that keeps getting created:

{
"quickrun": {
  "type": "fs",
  "settings": {
     "max_restore_bytes_per_sec": "1gb",
     "location": "/data/snapshots/quickrun_repo",
     "max_snapshot_bytes_per_sec": "1gb"
   }
 }
}
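
The "created again after an hour" observation can be checked against the log excerpt in the first post, which shows a `delete repository [quickrun]` line followed later by a `put repository [quickrun]` line. As a rough sketch (the helper name `minutes_between` is made up for illustration), the gap between those two timestamps can be computed like this:

```python
from datetime import datetime

# Timestamp layout used by the log lines in this thread, e.g. "2015-07-17 00:21:14,164".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

def minutes_between(earlier, later):
    """Return the gap between two log timestamps in minutes."""
    delta = datetime.strptime(later, TS_FORMAT) - datetime.strptime(earlier, TS_FORMAT)
    return delta.total_seconds() / 60

# Timestamps copied from the "delete repository" and "put repository" log lines.
gap = minutes_between("2015-07-17 00:21:14,164", "2015-07-17 01:17:21,934")
print(round(gap, 1))  # prints 56.1
```

A repository that comes back on a regular schedule like this would be consistent with something automated issuing the request, although two log lines alone cannot prove that.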

(Mark Walkom) #11

What is quickrun? Chances are it's something someone has set up to create this, so it might pay to ask your colleagues.
I'd also suggest putting either Shield or a reverse proxy in place to log what is happening.

If you are using time-based indices, then why bother with TTL? You can just delete the index after the period it needs to exist for. TTL is inefficient, as ES has to constantly scan all your documents to see if a TTL has been reached.
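
To illustrate the "just delete the index" approach, here is a minimal sketch that picks out daily indices older than a retention window. It assumes the `logstash-YYYY.MM.DD` naming seen in the logs earlier in the thread; `expired_indices` and `keep_days` are hypothetical names chosen for this example.

```python
from datetime import date, datetime, timedelta

def expired_indices(index_names, today, keep_days, prefix="logstash-"):
    """Return names of daily indices dated more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the time-based indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired

names = ["logstash-2015.07.10", "logstash-2015.07.17", ".kibana"]
print(expired_indices(names, date(2015, 7, 17), keep_days=3))
```

Each returned name could then be removed with a single `DELETE /<index>` request, which drops the whole index at once instead of expiring documents one by one. Note that with daily indices the retention granularity is a whole day.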


(B Zen) #12

quickrun is the project I am working on. I am currently managing these clusters, and I am absolutely sure that no one else is doing this. It keeps happening again and again. I tried deleting everything and starting again, but that doesn't help. Please help me debug this. Also, I run these machines on a VPN.
Please help me solve this; I am so frustrated by this issue. Thanks.


(Mark Walkom) #13

Did you try my suggestions? Because something external is doing this, and that'd be how you can find out what.


(B Zen) #14

I don't have the Shield plugin, but I am running Elasticsearch and Logstash on a local machine (which is not connected to the internet) to see if the same behavior happens. Is there any reason this can happen?


(Mark Walkom) #15

As has been mentioned a number of times before, this will only happen if a request is made to ES from an external source.


(B Zen) #16

Thanks, I found that somebody had started a cron job for snapshots. I am not sure why I couldn't see this before. Thank you all for bearing with me.
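
For anyone hitting the same symptom, a quick way to look for a scheduled job like this is to search crontab contents for requests to the `_snapshot` endpoint. The sketch below is only an illustration: the thread does not show the actual cron entry, so the sample crontab text and the helper name `cron_lines_mentioning` are hypothetical.

```python
def cron_lines_mentioning(crontab_text, needle="_snapshot"):
    """Return active (non-comment) crontab lines that contain needle."""
    hits = []
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if needle in line:
            hits.append(line)
    return hits

# Hypothetical crontab text, for illustration only.
sample = """
# nightly cleanup
0 3 * * * /usr/local/bin/cleanup.sh
17 * * * * curl -XPUT http://localhost:9200/_snapshot/quickrun -d @repo.json
"""
print(cron_lines_mentioning(sample))
```

The text to scan could come from `crontab -l` for each user or from the files under `/etc/cron.d`, on every machine that can reach the cluster.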

