Hi,
I have designed the following scenario for ElasticSearch, but I need answers
to some questions regarding memory allocation and replicas.
Scenario
I will have an ES cluster composed of N nodes (I will start with just 1, but
it could grow to 10 in the future). Each node will have a sufficient amount
of memory available (roughly 10GB-30GB). I have the following two indices.
Index no. 1: Just a few GB of data (probably no more than 3GB at any time,
no more than a million documents), with a few new documents per minute (this
can be regulated manually and doesn't depend on user behaviour). Lots of
reads on this index (probably 80% or more of the estimated read load on the
ES cluster).
Index no. 2: User-dependent data (routed by user_id into a single shard).
An order of magnitude more data than Index no. 1. Many more writes (I can't
estimate them yet, but probably hundreds per minute, thousands in the worst
case), but far fewer reads. Moreover, only a few active users will access
their data within any short period of time. Every read will be routed by a
single user_id. I can have an arbitrary number of shards, and the
replication factor can even be 0, as I can reindex missing data at any time
before the next demand.
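To make the routing for Index no. 2 concrete, this is roughly how I picture the requests. It is only a sketch: the index name user_data, the field name user_id in the query, the shard count in the example and the helper functions are placeholders I made up, and nothing here talks to a real cluster; it just builds the request path and bodies.

```python
import json
from typing import Tuple
from urllib.parse import urlencode

# Hypothetical index name, used only for illustration.
INDEX_2 = "user_data"


def index_2_settings(shards: int, replicas: int = 0) -> dict:
    """Settings body for creating Index no. 2 (replicas may be 0 in my case)."""
    return {
        "settings": {
            "index.number_of_shards": shards,
            "index.number_of_replicas": replicas,
        }
    }


def routed_search(user_id: int) -> Tuple[str, dict]:
    """Path and body of a search routed to the shard that holds one user's data."""
    # The routing parameter only selects the shard; several users can share
    # that shard, so the query still has to filter on user_id itself.
    path = "/" + INDEX_2 + "/_search?" + urlencode({"routing": user_id})
    body = {"query": {"term": {"user_id": user_id}}}
    return path, body


if __name__ == "__main__":
    print(json.dumps(index_2_settings(shards=8), indent=2))
    path, body = routed_search(42)
    print("GET", path)
    print(json.dumps(body))
```

Writes would carry the same routing value, so that a user's documents always end up in the shard that the routed reads hit.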
Questions
- Is it a good idea to have index no. 1 in just one shard, replicated to
  every node in the cluster (maybe not all, but definitely those that will
  handle HTTP requests, done via a node tag and an include rule in the index
  settings)?
- Sometimes I will need to do a bulk update on index no. 1 (adding an integer
  value to the array in a single column, in thousands of documents). Is that
  possible with so many replicas? Are there any gotchas?
- When I read data via user_id (from index no. 2), is the whole shard
  accessed or only the required part?
- If a shard of index no. 2 exceeds the available memory and a read is
  executed, does it knock index no. 1 out of memory, or does ES keep it
  there?
- Should I set index no. 1 to the memory type (once I have more nodes in the
  cluster) to keep it in memory all the time?
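For the first question, the settings I have in mind for index no. 1 would look roughly like the sketch below. The node attribute name tag, its value http and the helper function are my own placeholders; the replica count is simply the number of tagged nodes minus one, because the primary already occupies one of them.

```python
import json


def index_1_settings(http_nodes: int, node_tag: str = "http") -> dict:
    """Settings body for index no. 1: one shard, one copy per tagged node."""
    if http_nodes < 1:
        raise ValueError("need at least one node to hold the primary shard")
    return {
        "settings": {
            "index.number_of_shards": 1,
            # One primary plus (http_nodes - 1) replicas gives one copy per node.
            "index.number_of_replicas": http_nodes - 1,
            # Only allocate shard copies on nodes whose custom attribute
            # "tag" (hypothetical name) matches node_tag.
            "index.routing.allocation.include.tag": node_tag,
        }
    }


if __name__ == "__main__":
    # With a single node there are no replicas; with three tagged nodes, two.
    print(json.dumps(index_1_settings(1), indent=2))
    print(json.dumps(index_1_settings(3), indent=2))
```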
Thanks for every single answer.
Ales