Thought to myself: ES is great for making data queryable, but one of the issues with Docker is that you commonly need to reingest the dataset every time a Docker instance of Elasticsearch is stood up, before it's functional.
I said to myself, "Well, that just means I can use volumes then; I could link a preexisting database to an Elasticsearch Docker container." This can be done from within Compose or with the docker run -v flag. So ideally, if I need to query something in Elasticsearch, I launch a container, link the existing database, and after it finishes initializing I can just hit the 9200 endpoint, something like the sketch below.
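To make it concrete, here's a minimal sketch of what I have in mind (the /data/esdata host path and the 7.17.0 image tag are just placeholders; I'm assuming the data was ingested with the same major version of the official image):

    # mount a preexisting data directory into the image's data path
    docker run -d --name es-query \
      -v /data/esdata:/usr/share/elasticsearch/data \
      -e "discovery.type=single-node" \
      -p 9200:9200 \
      docker.elastic.co/elasticsearch/elasticsearch:7.17.0

    # once it finishes initializing, query it
    curl "http://localhost:9200/_cat/indices?v"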
This sounds great if the purpose is a single ingestion, after which you only spin something up when you want to query it.
Has anyone ever done something like this? It sounds doable to me. The issue I keep running into is HDD access when the volume lives anywhere outside the regular system:
Mac: Docker fails to mount volumes at /Volumes/
Linux (Debian 9): Docker fails to mount to /media/sf_
Windows 10 Home: Docker fails to mount to H:/
*Note: All of these Docker instances work fine with host folders, just not with additional drives or mounted external HDDs.
All of these mounts are limited to 755 perms. Since what I'm thinking doesn't sound unreasonable, how have others approached this problem?
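In case it matters, the obvious workaround would be to hand the mount over to the container's user. As far as I know the official image runs as UID 1000 (that UID is my assumption, not something I've verified across versions), so on Linux something like:

    # give the in-container elasticsearch user (UID 1000, I believe) write access
    sudo chown -R 1000:1000 /media/sf_esdata   # placeholder path
    sudo chmod -R 775 /media/sf_esdata

Though given that the failures above happen at mount time, I suspect host-side perms aren't the whole story.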
I noticed that on a Raspberry Pi it does work with an external HDD plugged in, but it writes slowly and eventually can't process anything: it runs into a RAM issue where it can never allocate enough memory and just waits forever.
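If anyone wants to reproduce that, capping the JVM heap via ES_JAVA_OPTS is the kind of thing I'd expect to mitigate the RAM issue on a Pi (the 256m figure and the /mnt/usbhdd path are guesses on my part):

    # cap the heap so Elasticsearch can't try to allocate more RAM than the Pi has
    docker run -d \
      -e ES_JAVA_OPTS="-Xms256m -Xmx256m" \
      -e "discovery.type=single-node" \
      -v /mnt/usbhdd/esdata:/usr/share/elasticsearch/data \
      -p 9200:9200 \
      docker.elastic.co/elasticsearch/elasticsearch:7.17.0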
Given that information, it seems I may need to not use Docker at all, for the ingestion process at least, in order to have access to these external volumes. What recommendations do you all have?
It seems like my issue is Docker related, but I thought I could set the user on the container to get the access I want. That seems to be incorrect.
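For context, this is the kind of user-setting I mean (the host path is a placeholder):

    # run the container as my own UID/GID instead of the image default
    docker run --user "$(id -u):$(id -g)" \
      -v /Volumes/ExternalHDD/esdata:/usr/share/elasticsearch/data \
      -e "discovery.type=single-node" \
      -p 9200:9200 \
      docker.elastic.co/elasticsearch/elasticsearch:7.17.0

Even with that, the mounts above fail before permissions ever come into play, which is what makes me think this is a Docker (or Docker Desktop file-sharing) limitation rather than a straight perms problem.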