ES heavy write & index creation

Hi,

Recently I was looking for a real-time analytics tool, and I found ES.

Basically, we want to show aggregate information for multiple customers,
and this information has to be shown in real time. The information expires
after one month. We want to use aggregations for the analytics.

Documents have around 5 to 10 fields, basically all integers and integer
arrays. We want to try indexing around 120M documents every day (it can
increase). I was doing a POC writing 1000 docs/s using the ES Java API
with bulk inserts, and the performance decreases with every write.
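
As a rough sanity check on these numbers, here is a small sketch of what
they imply. It assumes the load is spread evenly over the day and that
"one month" means 30 days; both are assumptions, and real traffic will
have peaks above the average.

```java
import java.util.Locale;

public class IngestEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    // Average write rate needed to absorb a daily volume spread evenly over 24 hours.
    static double docsPerSecond(long docsPerDay) {
        return (double) docsPerDay / SECONDS_PER_DAY;
    }

    // Documents held at steady state when each one expires after retentionDays.
    static long retainedDocs(long docsPerDay, int retentionDays) {
        return docsPerDay * retentionDays;
    }

    public static void main(String[] args) {
        long docsPerDay = 120_000_000L; // daily volume from this post
        int retentionDays = 30;         // assumption: "one month" taken as 30 days

        System.out.printf(Locale.ROOT, "average rate: %.1f docs/s%n",
                docsPerSecond(docsPerDay));
        System.out.printf(Locale.ROOT, "retained: %d docs%n",
                retainedDocs(docsPerDay, retentionDays));
    }
}
```

This prints an average of about 1388.9 docs/s and 3600000000 retained
documents, so the 1000 docs/s used in the POC is below the average rate
that 120M documents per day would require.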

The POC setup was very simple: a developer machine with an Intel Core i5,
a 250 GB SSD, and 16 GB RAM (xmx=8). I use a configuration similar to this
one: https://gist.github.com/reyjrar/4364063

The structure for indexing is: http://localhost:port/category/campaign_id/
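
To make that structure concrete, here is a minimal sketch of what a bulk
request body for it could look like. It assumes `category` is the index
name and `campaign_id` is the mapping type (which only applies to
Elasticsearch versions that still have mapping types). The class name and
the document fields (`campaign_id`, `clicks`, `tags`) are hypothetical and
only for illustration. The sketch builds the newline-delimited body by
hand rather than going through the Java API.

```java
import java.util.Arrays;
import java.util.List;

public class BulkBodySketch {
    // One bulk entry is two lines: an action line naming the index and type,
    // followed by the document source. Every line must end with a newline.
    static String bulkBody(String index, String type, List<String> jsonDocs) {
        StringBuilder sb = new StringBuilder();
        for (String doc : jsonDocs) {
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_type\":\"").append(type).append("\"}}\n");
            sb.append(doc).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical documents: integer fields plus an integer array.
        List<String> docs = Arrays.asList(
            "{\"campaign_id\":42,\"clicks\":3,\"tags\":[1,5,9]}",
            "{\"campaign_id\":42,\"clicks\":7,\"tags\":[2,5]}");
        // The resulting text is what would be sent to the _bulk endpoint.
        System.out.print(bulkBody("category", "campaign_id", docs));
    }
}
```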

We want to deploy a cluster in EC2 for this POC. Does anyone have advice
about:

  • Which EC2 instance type to take, and how many?
  • Replica strategy and number of shards?
  • Structure for indexing documents (we were thinking of creating an index
    for each campaign_id)
