Where is the data stored?

Hello.
When I send my Windows Event Logs using "Winlogbeat" directly to "Elastic", where is my data stored? I mean, is it something like a file?

Thank you.

It's stored in Elasticsearch. Where exactly depends on how you installed it: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/install-elasticsearch.html

I installed it via the "yum" command on CentOS. Where is the correct directory?

The directory is mentioned on this page https://www.elastic.co/guide/en/elasticsearch/reference/5.5/rpm.html

You mean something like:

/var/lib/elasticsearch/nodes/0/indices

But they are not human readable!!!

No, why does that matter though? Never interact with the files that Elasticsearch creates directly on the filesystem, always use the APIs.

For backup?

You shouldn't use file backups for Elasticsearch, but rather the snapshot and restore APIs. This can be done by way of other tools, like Elasticsearch Curator.

The primary reason to not use a file-type backup approach is that the data would very likely be corrupted. The Lucene data structures in the should-never-be-touched data paths are in constant change so long as indexing is going on. If one file were backed up while another were changing, then there would be a mismatch, and corruption would ensue. Your file-based backup would be worthless.

Thank you so much for the info. I have some questions:
1- In the "Repository" section, must the code be written in "Dev Tools"?
2- If I don't want to work with "Dev Tools", must I read "Shared File System Repository" and change the "elasticsearch.yml" configuration?
3- Are both the same?
4- How about restoring them from a file via a config file?

I see two paths in the parameter path.repo: ["/mount/backups", "/mount/longterm_backups"]. What is "/mount/longterm_backups", and is it mandatory?

I added the line below to my Elasticsearch configuration and restarted the "elasticsearch" service, but no file was created:

path.repo: ["/var/log/back","/var/log/back-long"]

I can also see another option in my configuration file: "#path.data: /path/to/data"!
Thanks.

If you're creating the repository that way, then yes. Otherwise there's a rather undocumented tool included with Curator called es_repo_mgr which allows you to create fs and s3 repositories at the command line. Just run es_repo_mgr --help and see the options you can use, which should mirror the ones in the online example.

You must configure the shared filesystem repository (type fs) in both places. It must be done in the API and with path.repo in elasticsearch.yml.

Restore is done using the API, as linked above, or the Restore action in Curator, which does use a YAML configuration file.
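To make the API route for restore a little more concrete, here is a minimal Python sketch that only builds the restore request and does not send it. The host `http://localhost:9200`, the repository name `my_backup`, the snapshot name `snapshot_1`, the index names, and the helper `build_restore_request` are all assumptions for illustration, not values from this thread.

```python
import json
import urllib.request


def build_restore_request(base_url, repo_name, snapshot_name, indices):
    """Build (but do not send) a POST /_snapshot/<repo>/<snapshot>/_restore request."""
    body = {
        "indices": ",".join(indices),      # comma-separated list of indices to restore
        "ignore_unavailable": True,        # skip indices missing from the snapshot
        "include_global_state": False,     # do not overwrite cluster-wide state
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}/{snapshot_name}/_restore",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names: replace them with your own repository, snapshot, and indices.
restore_request = build_restore_request(
    "http://localhost:9200", "my_backup", "snapshot_1", ["index_1", "index_2"]
)
print(restore_request.get_method(), restore_request.full_url)
print(restore_request.data.decode("utf-8"))

# To actually send it (needs a running node and an existing snapshot):
# with urllib.request.urlopen(restore_request) as response:
#     print(response.read().decode("utf-8"))
```

Keep in mind that an index which already exists in the cluster generally has to be closed (or deleted) before it can be restored over.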

That does not create a path. It merely tells Elasticsearch that use of that path is acceptable. The path must still be added via the API (or a tool like the aforementioned es_repo_mgr, which does the API calls for you).
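As a rough illustration of the two-step setup described above (whitelist the path with path.repo in elasticsearch.yml, then register it through the API), this Python sketch builds the registration request without sending it. The host `http://localhost:9200`, the repository name `my_backup`, and the helper name `build_repo_request` are assumptions for illustration; the location is the first path.repo entry quoted earlier in the thread.

```python
import json
import urllib.request


def build_repo_request(base_url, repo_name, location, compress=True):
    """Build (but do not send) a PUT /_snapshot/<repo_name> request for an fs repository."""
    body = {
        "type": "fs",
        "settings": {"location": location, "compress": compress},
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical values: adjust host, name, and path to your own cluster.
repo_request = build_repo_request("http://localhost:9200", "my_backup", "/var/log/back")
print(repo_request.get_method(), repo_request.full_url)
print(repo_request.data.decode("utf-8"))

# To actually send it (needs a running node and a location allowed by path.repo):
# with urllib.request.urlopen(repo_request) as response:
#     print(response.read().decode("utf-8"))
```

The path.repo line alone never triggers any of this; only the API call (made by hand or by a tool) registers the repository.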

API? Can you give me an example?

The examples are in the snapshot and restore link I mentioned previously, but here's one of them:

PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
        ... repository specific settings ...
  }
}

Excuse me, I don't understand "... repository specific settings ..."!
I'd like to create a backup of the "server1-" and "server2-" indices.

# curl -XGET 'http://localhost:9200/_cat/indices?v'
health status index              uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   server2-2017.08.28 8KF2JptfS9q6qNgsscqDLQ   5   1          6            0    121.2kb        121.2kb
yellow open   server1-2017.09.01 hQl7hSIfRPmbL6wBjqSwpw   5   1         17            0      217kb          217kb
yellow open   server1-2017.08.28 00yeQwotQj-s3ZEHXvM-SA   5   1          2            0     40.2kb         40.2kb
yellow open   .kibana            qx6nf2-4Q9O7jU_ron7eNw   1   1          3            0     23.3kb         23.3kb

My elasticsearch config is as below:

# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
path.repo: ["/var/log/back","/var/log/back-long"]
#
# Path to log files:
#
path.logs: /var/log/elastic
#

I'd be thankful if you could show me a working example.

Thank you.

That's not how snapshots work. You snapshot selected indices from all nodes (the entire cluster), or not at all.

Did you visit the link I sent already? It has examples which show location and compression as potential options for settings:

        "location": "/mount/backups/my_backup",
        "compress": true

I'm a bit concerned that you're defining your path.repo as /var/log/back or /var/log/back-long. Are these shared file systems that just happen to be mounted in /var/log? If not, then you will not be able to create a snapshot repository. A snapshot repository must be a shared filesystem, like NFS, to which each master and data node has read and write access.
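For the "selected indices" part, a snapshot request can name the indices it should cover. The following Python sketch only builds such a request; it assumes a repository called `my_backup` is already registered, and the snapshot name `snapshot_1`, the host `http://localhost:9200`, and the helper `build_snapshot_request` are hypothetical. The index patterns follow the `server1-` and `server2-` prefixes shown in the index listing above.

```python
import json
import urllib.parse
import urllib.request


def build_snapshot_request(base_url, repo_name, snapshot_name, index_patterns):
    """Build (but do not send) a PUT /_snapshot/<repo>/<snapshot> request."""
    body = {
        "indices": ",".join(index_patterns),  # wildcards select indices by prefix
        "ignore_unavailable": True,           # do not fail on a pattern with no match
        "include_global_state": False,        # leave cluster-wide state out of the snapshot
    }
    query = urllib.parse.urlencode({"wait_for_completion": "true"})
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}/{snapshot_name}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical repository and snapshot names; the patterns match the listing above.
snapshot_request = build_snapshot_request(
    "http://localhost:9200", "my_backup", "snapshot_1", ["server1-*", "server2-*"]
)
print(snapshot_request.get_method(), snapshot_request.full_url)
print(snapshot_request.data.decode("utf-8"))

# To actually send it (needs a running cluster and a registered repository):
# with urllib.request.urlopen(snapshot_request) as response:
#     print(response.read().decode("utf-8"))
```

The request is sent to one node, but the snapshot itself is taken across the whole cluster, which is why every master and data node needs access to the repository location.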

Thank you.
It's just a test, and I know "/var/log" is not a good location. I'd like to create a backup, then remove the server and restore my backup.
I read the link, but I'm a beginner and I don't know what "... repository specific settings ..." is. Is it parameters, or...?
I'd be thankful if you could provide the commands.

Can ... repository specific settings ... be:

"compress": true,
"location": "/mount/backups/my_backup"

?

Yes. Exactly.

Does "location" correspond to path.repo in my config file?

Yes, location should match one of the entries in path.repo. path.repo is where you tell Elasticsearch that it is acceptable to use that given mount point as a location for a repository. It is a hard, config-file based whitelist.
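The whitelist rule can be approximated on the client side before calling the API. This Python sketch is only an illustration of the idea, not how Elasticsearch implements the check: the helper `location_is_whitelisted` is hypothetical, it handles absolute POSIX paths only, and it does not resolve symlinks or ".." segments.

```python
from pathlib import PurePosixPath


def location_is_whitelisted(location, path_repo):
    """Return True if `location` equals, or lies beneath, one of the `path_repo` entries."""
    candidate = PurePosixPath(location)
    for entry in path_repo:
        root = PurePosixPath(entry)
        if candidate == root or root in candidate.parents:
            return True
    return False


# The path.repo value quoted earlier in this thread.
path_repo = ["/var/log/back", "/var/log/back-long"]

print(location_is_whitelisted("/var/log/back", path_repo))            # True
print(location_is_whitelisted("/var/log/back/my_backup", path_repo))  # True
print(location_is_whitelisted("/var/log/backup", path_repo))          # False
```

Passing this check says nothing about file permissions: the user that runs Elasticsearch still needs read and write access to the directory.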

I have written:

PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "compress": true,
    "location": "/var/log/back"
  }
}

and I got:

{
  "error": {
    "root_cause": [
      {
        "type": "exception",
        "reason": "failed to create blob container"
      }
    ],
    "type": "exception",
    "reason": "failed to create blob container",
    "caused_by": {
      "type": "access_denied_exception",
      "reason": "/var/log/back/tests-CHKsf_FGS2muTXMF555buA"
    }
  },
  "status": 500
}

Why?