We use Elasticsearch in our integration tests, which run many times every day. Running ES this way has different requirements from our production environment: we need ES to start as fast as possible, with only the modules/plugins we actually need. Unfortunately, out of the box, ES is really slow to start. We run our stack in Kubernetes, and all our other dependency containers (Redis, Postgres, etc.) are up within 30 seconds, but ES can take several minutes, and our app is stuck waiting for ES before it can come up.
We've tried a few optimizations, including:
- Running a single node cluster
- Disabling the GeoIP downloader by setting ingest.geoip.downloader.enabled=false
- Tweaking container memory, heap size, and other JVM settings
- Running larger Kubernetes nodes
While some of this has helped, startup is still very slow (a couple of minutes) for something that runs this often. Does anyone have experience optimizing ES startup times for this kind of use? I have found little documentation on it. We have a pretty basic use case (we use the ngram analyzer to enable smarter searches in our application), and I suspect we don't need a number of the modules that load out of the box, such as ingest-geoip.
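
For concreteness, the optimizations listed above could be expressed roughly as follows. This is a minimal sketch, not our exact config: the helper names es_test_env and to_k8s_env and the 512 MB heap are illustrative assumptions. discovery.type=single-node and ingest.geoip.downloader.enabled=false are standard Elasticsearch settings, and ES_JAVA_OPTS is the usual way to pass JVM flags; all three are given here as container environment variables, which the official Docker image accepts.

```python
from __future__ import annotations


def es_test_env(heap_mb: int = 512) -> dict[str, str]:
    """Build environment variables for a throwaway single-node ES test container.

    heap_mb is an assumed value for illustration; tune it for your own tests.
    """
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    heap = f"{heap_mb}m"
    return {
        # Form a one-node cluster without discovering or joining other nodes.
        "discovery.type": "single-node",
        # Do not download GeoIP databases in the background.
        "ingest.geoip.downloader.enabled": "false",
        # Equal min and max heap so the JVM does not resize the heap.
        "ES_JAVA_OPTS": f"-Xms{heap} -Xmx{heap}",
    }


def to_k8s_env(env: dict[str, str]) -> list[dict[str, str]]:
    """Convert the mapping to the name/value list used in a Kubernetes container spec."""
    return [{"name": name, "value": value} for name, value in env.items()]
```

Calling to_k8s_env(es_test_env()) gives a list that can be dropped into the env section of the ES container spec. Keeping the settings in one function makes it easy to try one change at a time and compare startup times.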