Import data from folder

Hello.
This is my first post, and first time using elasticsearch.

I've been given an Elasticsearch dump: a large tar.gz that expands to a folder with ~50 subfolders, which mostly come in pairs (e.g. one named video alongside one named video-reference).
Inside these I again find many more folders. Importantly, among them there is always one named _mapping, which holds a file named null (no suffix) containing what I'm pretty sure is the schema (I recognise it from the tutorial - load dataset). Siblings to these _mapping folders are many, many more folders that each contain only one file, always named _source (again no suffix), which holds a JSON document.

I've set up elasticsearch and kibana on my local machine, and would like to explore this data I've been given.

Do you recognise the folder-structure and files mentioned above? I assume it's from some standard export tool/dump procedure/etc. I would like to import this data, any hints on where I should start?

Could you perhaps post an example of the _source file and _mapping directories?

It sounds like you could use logstash's file input to read the file and send it to Elasticsearch, but I'm having difficulty following the structure and content of the files and directories.

Ok.
Here's an example of a _source file.

{"remoteId":"367783173981106176","uRL":"https://twitter.com/ingunnsolheim/statuses/367783173981106176","metadata":{"versionId":"1.11182854.1376523439","state":"PUBLISHED","publicState":"PUBLISHED","versionPublished":"2013-08-15T01:37:20+02:00","modifier":"18.13156","publicVersionId":"1.11182854.1376523439","owner":"18.13156","_type":"Metadata"},"version":1376523439,"title":"https://twitter.com/ingunnsolheim/statuses/367783173981106176","modified":"2013-08-15T01:37:20+02:00","id":"1.11182854","published":"2013-08-15T01:37:20+02:00","_type":"Tweet"}

The file I assume is the corresponding schema file looks like this (file named null):

{"tweet":{"dynamic_templates":[{"long_text_strings":{"mapping":{"index":"analyzed","similarity":"default","analyzer":"norwegian","type":"string"},"match":"abstract|body|instructions|text|data","match_mapping_type":"string","match_pattern":"regex"}},{"short_text_strings":{"mapping":{"index":"analyzed","similarity":"BM25","analyzer":"norwegian","type":"string"},"match":".*title|.*Title|tittel|beskrivelse|caption|description|flip|footer|information|intro|lead|usage|searchWords","match_mapping_type":"string","match_pattern":"regex"}},{"other_strings":{"mapping":{"index":"not_analyzed","type":"string"},"match":"*","match_mapping_type":"string"}}],"_timestamp":{"enabled":true,"store":true},"properties":{"_type":{"type":"string","index":"not_analyzed"},"id":{"type":"string","index":"not_analyzed"},"indexedAt":{"type":"date","format":"dateOptionalTime"},"metadata":{"properties":{"_type":{"type":"string","index":"not_analyzed"},"firstVersionId":{"type":"string","index":"not_analyzed"},"id":{"type":"string","index":"not_analyzed"},"modelType":{"type":"string","index":"not_analyzed"},"modifier":{"type":"string","index":"not_analyzed"},"owner":{"type":"string","index":"not_analyzed"},"previousVersionId":{"type":"string","index":"not_analyzed"},"publicState":{"type":"string","index":"not_analyzed"},"publicVersionId":{"type":"string","index":"not_analyzed"},"state":{"type":"string","index":"not_analyzed"},"version":{"properties":{"_type":{"type":"string","index":"not_analyzed"},"created":{"type":"date","format":"dateOptionalTime"},"id":{"type":"string","index":"not_analyzed"},"modelType":{"type":"string","index":"not_analyzed"},"published":{"type":"date","format":"dateOptionalTime"},"version":{"type":"long"}}},"versionId":{"type":"string","index":"not_analyzed"},"versionPublished":{"type":"date","format":"dateOptionalTime"},"versions":{"properties":{"_type":{"type":"string","index":"not_analyzed"},"created":{"type":"date","format":"dateOptionalTime"},"id":{"type":"string","index"
:"not_analyzed"},"updated":{"type":"date","format":"dateOptionalTime"},"version":{"type":"long"}}}}},"modelType":{"type":"string","index":"not_analyzed"},"modified":{"type":"date","format":"dateOptionalTime"},"plug":{"properties":{"_type":{"type":"string","index":"not_analyzed"},"modelType":{"type":"string","index":"not_analyzed"}}},"published":{"type":"date","format":"dateOptionalTime"},"remoteId":{"type":"string","index":"not_analyzed"},"title":{"type":"string","analyzer":"norwegian","similarity":"BM25"},"uRL":{"type":"string","index":"not_analyzed"},"updated":{"type":"date","format":"dateOptionalTime"},"version":{"type":"long"}}}}
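
A note on this mapping: it uses the legacy string type with "index":"analyzed" / "index":"not_analyzed", which Elasticsearch 5.0 replaced with the text and keyword types. Assuming the local install is 5.x or later, the mapping would probably need converting before it can be applied. The following is a minimal sketch of such a conversion, not a complete migration: the function name modernize_mapping is made up for illustration, and other legacy entries (such as _timestamp) are left untouched and would need separate handling.

```python
import json


def modernize_mapping(node):
    """Recursively rewrite legacy `string` field definitions.

    Assumption: the target cluster is Elasticsearch 5.x or later, where
    `string` + `not_analyzed` becomes `keyword` and any other `string`
    becomes `text`. Everything else is copied through unchanged.
    """
    if isinstance(node, list):
        return [modernize_mapping(item) for item in node]
    if not isinstance(node, dict):
        return node

    converted = {key: modernize_mapping(value) for key, value in node.items()}

    if converted.get("type") == "string":
        if converted.get("index") == "not_analyzed":
            converted["type"] = "keyword"
        else:
            converted["type"] = "text"
        # The old analyzed / not_analyzed values are now implied by the type.
        if converted.get("index") in ("analyzed", "not_analyzed"):
            del converted["index"]

    return converted


if __name__ == "__main__":
    # Small excerpt in the style of the mapping above.
    legacy = {
        "properties": {
            "id": {"type": "string", "index": "not_analyzed"},
            "title": {"type": "string", "analyzer": "norwegian"},
        }
    }
    print(json.dumps(modernize_mapping(legacy), indent=2))
```

The whole file named null could be passed through the same function after json.load, but the result should be checked by hand before creating an index from it.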

The folder structure then looks like this:

Does this still sound like a job for "logstash's file input"? Can you point me towards some material on this?
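
An alternative to Logstash that may be easier with one-document-per-file dumps is to walk the folder with a small script and turn the _source files into a single newline-delimited JSON file for the bulk API. The sketch below is only that, a sketch, and it rests on assumptions not confirmed in this thread: that the top-level subfolder name (e.g. video) is the index name, and that the name of the folder holding each _source file is the document id. The function names iter_bulk_lines and write_bulk_file are made up for illustration.

```python
import json
from pathlib import Path


def iter_bulk_lines(dump_root):
    """Yield bulk-API lines (action line, then document line) for each _source file.

    Assumed layout (a guess): <dump_root>/<index>/.../<doc_id>/_source
    """
    root = Path(dump_root)
    for source_file in sorted(root.rglob("_source")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(root)
        if len(relative.parts) < 3:
            continue  # too shallow to have both an index folder and an id folder
        index_name = relative.parts[0]      # assumption: top-level folder = index
        doc_id = source_file.parent.name    # assumption: parent folder = document id
        document = json.loads(source_file.read_text(encoding="utf-8"))
        yield json.dumps({"index": {"_index": index_name, "_id": doc_id}})
        yield json.dumps(document)


def write_bulk_file(dump_root, output_path):
    """Write all bulk lines to output_path and return the number of documents."""
    line_count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for line in iter_bulk_lines(dump_root):
            out.write(line + "\n")  # the bulk format requires newline-terminated lines
            line_count += 1
    return line_count // 2


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        print(write_bulk_file(sys.argv[1], sys.argv[2]), "documents written")
    else:
        print("usage: script.py <dump_root> <output.ndjson>")
```

The resulting file can then be sent to the _bulk endpoint with the Content-Type header set to application/x-ndjson. For a large dump it would be sensible to split the output into smaller batches, and to create each index with a suitable mapping first.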
