Elasticsearch cluster / nodes / shards conf

It will take some experimentation and tuning to be sure that your setup has the performance characteristics you need. Try importing increasingly large subsets of your log file first. This shows you how performance changes as the data grows, and lets you check that your mappings suit the searches you want to perform.
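As a rough sketch of the "increasingly large subsets" idea, the helpers below pick a series of growing sample sizes and take the first N lines of a log. The names `subset_sizes` and `head_lines`, the starting size of 1,000 lines and the growth factor of 10 are all assumptions made for illustration; the actual import into Elasticsearch is left out.

```python
from itertools import islice


def subset_sizes(total_lines, start=1_000, factor=10):
    """Yield growing sample sizes, always ending with the full line count.

    `start` and `factor` are arbitrary illustrative defaults.
    """
    size = start
    while size < total_lines:
        yield size
        size *= factor
    yield total_lines


def head_lines(lines, n):
    """Return the first n items from any iterable of log lines."""
    return list(islice(lines, n))
```

For each size you would index `head_lines(open_log_file, size)` into a scratch index, run your typical searches, and note the index size and query times before moving on to the next size.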

If the index will eventually be 400GB then this article suggests you will want to split it into around 10-20 shards, which works out to roughly 20-40GB per shard. However, if you do not need all the fields to be indexed then you might find your index becomes much smaller than the source data, so you will be able to work with correspondingly fewer shards.
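The arithmetic behind that estimate is just the index size divided by a target shard size, rounded up. The sketch below assumes a target of 20-40GB per shard, which is only what the 400GB and 10-20 shard figures imply; `shard_count` is a hypothetical helper, not part of any Elasticsearch client.

```python
import math


def shard_count(index_size_gb, target_shard_gb):
    """Estimate how many primary shards keep each shard near the target size."""
    if index_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up so that no shard exceeds the target size.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


# The 400GB example at both ends of the assumed 20-40GB range.
low = shard_count(400, 40)
high = shard_count(400, 20)
print(f"{low}-{high} shards")
```

If a trial import shows the index is, say, a quarter of the size of the source data, feed that smaller figure into the same calculation to get the reduced shard count.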

If your searches will be filtering on time ranges (common for searches of log data) then you might want to consider splitting the data into time-based indices, for example one index per day, rather than putting it into a single index with lots of shards. Using time-based indices will allow Elasticsearch to completely avoid searching any shards that it knows in advance do not contain any documents matching the time range specified in the search.
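To make the time-based approach concrete, here is a minimal sketch that works out which daily indices cover a given date range, so that a search only needs to name those indices. The `logs-YYYY.MM.DD` naming pattern and the `daily_indices` helper are assumptions for illustration; use whatever pattern your indexing pipeline actually writes.

```python
from datetime import date, timedelta


def daily_indices(start, end, prefix="logs"):
    """List the daily index names covering start..end inclusive.

    The "<prefix>-YYYY.MM.DD" pattern is an assumed naming convention.
    """
    if end < start:
        raise ValueError("end must not be before start")
    names = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        names.append(f"{prefix}-{day:%Y.%m.%d}")
    return names


# A three-day search window that crosses a month boundary.
print(",".join(daily_indices(date(2024, 1, 30), date(2024, 2, 1))))
```

The comma-separated list can then be used as the index part of a search request, so indices outside the window are never touched.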