Fast upserts, in-memory, aggregations on fast-expiring data?

Hi List,

From the looks of it, everything I need is possible, but I still have some
questions. My application consists of events that are upserted and expire
after 30 seconds, plus aggregations over those events. I always filter on
user_id, which is also the routing value.

event_fields =
{"user_id","timestamp","tags","dimension1","dimension2","dimension3"}
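
For illustration, here is a rough sketch of how one such event could be turned into bulk upsert lines. The index name `events`, the type name `event`, the id scheme, and the sample values are made up for this example. The `update` action with `doc_as_upsert` and the `_routing` metadata key reflect my understanding of the bulk API and may differ between Elasticsearch versions, so treat this as an assumption rather than a reference.

```python
import json


def bulk_upsert_lines(events, index="events", doc_type="event"):
    """Build an NDJSON bulk body that upserts each event.

    The index name, type name and id scheme are hypothetical.
    """
    lines = []
    for event in events:
        # Assumed id scheme: one document per (user_id, timestamp) pair.
        doc_id = "{}-{}".format(event["user_id"], event["timestamp"])
        action = {
            "update": {
                "_index": index,
                "_type": doc_type,
                "_id": doc_id,
                # Route by user_id so all of a user's events share a shard.
                "_routing": str(event["user_id"]),
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        # doc_as_upsert: insert the document if it does not exist yet.
        lines.append(json.dumps({"doc": event, "doc_as_upsert": True},
                                sort_keys=True))
    # The bulk body is newline-delimited and ends with a newline.
    return "\n".join(lines) + "\n"


sample_event = {
    "user_id": 42,
    "timestamp": 1000,  # assumed to be epoch milliseconds
    "tags": ["a", "b"],
    "dimension1": "x",
    "dimension2": "y",
    "dimension3": "z",
}
bulk_body = bulk_upsert_lines([sample_event])
```

The resulting string would be sent as the body of a single bulk request; batching many events per request is what keeps the 30-second churn cheap.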

Questions:

  1. 97% of queries are upserts of events that expire after 30 seconds.
    These will be bulk inserted. Since I'll always filter by timestamp, it's
    no problem that TTL only deletes every 60 seconds.
  2. Can I disable the translog/commit log (something like PostgreSQL's
    UNLOGGED TABLE)? If the node crashes, I don't care that I'll lose all the
    data, because it would have expired anyway by the time the node restarts.
  3. 2% of queries will be aggregation queries, which will always filter
    on "user_id" but may also filter on any of the other fields. Should I
    index every field? I think I only need to index user_id and the TTL,
    because maintaining the other indexes would be too much overhead given
    how quickly documents expire.
  4. How can I get a top-hits aggregation on the elements of an array
    field? Basically, explode/unnest the array and run the top-hits
    aggregation over the elements. Or is that done automatically?
  5. Is there no exact distinct-count aggregation, only an approximate one?
  6. Is there an in-memory option with no disk activity? I remember
    reading some threads saying that in-memory wasn't very good because the
    data was stored on the Java heap.
  7. Can I specify in the mapping that TTL uses the value of the
    'timestamp' field, so that I don't have both 'timestamp' and 'ttl' as
    separately indexed fields?
  8. Most of the queries will aggregate at most 100K documents, usually
    <=10K, so I think I only need to tune indexing and deletion
    performance.
  9. Should I disable compression? My main concern is CPU usage, and
    compression/decompression will add to it.
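
To make questions 1, 3 and 4 concrete, this is roughly the kind of aggregation request I have in mind, sketched as a Python dict. It assumes `timestamp` holds epoch milliseconds and that a `terms` aggregation on the array field `tags` buckets each element separately (which is exactly what question 4 asks about). The aggregation names `by_tag` and `latest` are made up, and the query is written in the `filtered` style; the exact shape depends on the Elasticsearch version.

```python
import time

EXPIRY_SECONDS = 30  # events older than this are treated as expired


def aggregation_query(user_id, now_ms=None, top_n=10):
    """Build a search body that only sees one user's non-expired events."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    cutoff_ms = now_ms - EXPIRY_SECONDS * 1000
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"user_id": user_id}},
                            # Hide events that TTL has not purged yet.
                            {"range": {"timestamp": {"gt": cutoff_ms}}},
                        ]
                    }
                }
            }
        },
        "aggs": {
            "by_tag": {
                # Assumption: each element of the tags array gets a bucket.
                "terms": {"field": "tags", "size": top_n},
                "aggs": {
                    "latest": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{"timestamp": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


query_body = aggregation_query(user_id=7, now_ms=100000)
```

The timestamp range filter is what makes the 60-second TTL purge interval harmless: documents that are expired but not yet deleted never reach the aggregation.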

Thanks
