For some background: we use Elasticsearch to index real estate transactions. Each transaction is added once and then updated 3-4 times (as agents are added, plus a few other specific actions that trigger a reindex). Transactions are kept in the system indefinitely so they can be searched in many contexts, even years later, and new ones are added every minute of the day since many companies are adding them.
Since we aren't writing hundreds of times per minute (this isn't a realtime logging use case), and since we retain data indefinitely (so we have millions of documents), our primary concern is search / read performance. Also, as mentioned in my question here, we are at times running less efficient queries that don't take advantage of Elasticsearch's core tokenizing / analysis features.
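To make that concrete, here's roughly the kind of thing I mean (the endpoint, index name, and field names below are just placeholders, not our real schema): a leading-wildcard query against a raw keyword field versus a plain match query that actually uses the analyzed terms.

```python
import requests

ES = "http://localhost:9200"   # placeholder endpoint
INDEX = "transactions"         # hypothetical index name

# The "less efficient" style: a leading wildcard on a raw keyword field,
# which can't make good use of the inverted index.
wildcard_query = {
    "query": {
        "wildcard": {"street_address.keyword": "*main st*"}  # hypothetical field
    }
}

# Versus a query that leans on the analyzer/tokenizer and matches terms
# straight out of the inverted index.
match_query = {
    "query": {
        "match": {"street_address": "main st"}
    }
}

for body in (wildcard_query, match_query):
    resp = requests.post(f"{ES}/{INDEX}/_search", json=body)
    print(resp.json().get("hits", {}).get("total"))
```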
So my question is: given the above scenario, does it make more sense to start with more small nodes (e.g., 6 nodes with 2 CPU cores / 4 GB of RAM each) or fewer large nodes (e.g., 3 nodes with 4 CPU cores / 8 GB of RAM each)? After asking this question, I'm inclined to start with a larger number of smaller nodes so that scaling is simpler: I could start with 6 small nodes and later double each one in size (from 2 CPU cores / 4 GB of RAM to 4 CPU cores / 8 GB of RAM) without needing to split indices, etc. A rough sketch of the shard layout I have in mind follows.
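This is just a sketch of the idea, assuming a placeholder endpoint and a hypothetical `transactions` index: create the index with one primary shard per planned node, so the shards can spread across the 6 small nodes now and each node can simply be resized later without reindexing.

```python
import requests

ES = "http://localhost:9200"   # placeholder endpoint
INDEX = "transactions"         # hypothetical index name

# Create the index with enough primary shards up front (one per planned node)
# so the cluster can spread them across 6 nodes now, and each node can just be
# made bigger later -- no reindex/split needed to add per-node capacity.
settings = {
    "settings": {
        "number_of_shards": 6,     # matches the planned 6 small nodes
        "number_of_replicas": 1    # one replica per primary for redundancy
    }
}

resp = requests.put(f"{ES}/{INDEX}", json=settings)
print(resp.json())
```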