While testing with a large amount of data, I am running into my first
performance issues. The initial situation is as follows:
- one ES node with 8GB heap assigned
- one index with 110.000.000 documents
- 78.000.000 docs assigned to a single _type
- histogram data and a sub-type of cardinality 20
- histogram query using aggregation over sub-type runs fast (< 3 seconds)
- histogram over the whole index and _type, ignoring the sub-type, runs up to 50
seconds on a cold index; on a warm index the same query takes 10-12 seconds
- there are currently no writes to the index and the index is optimized (this may
change in future)
- only one shard of size 30GB
- one index per month
- data for about the past 3-4 months
- Java 1.7u55 and ES 1.4.1
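
To make the comparison concrete, here is a minimal sketch of how the two query shapes might be built. The actual queries are not shown in this post, so everything here is an assumption: the field names (`value`, `subtype`), the type name, the interval, and the use of a numeric `histogram` aggregation nested under a `terms` aggregation (a `date_histogram` would be analogous for time-based data).

```python
def build_histogram_query(doc_type, value_field, interval, subtype_field=None):
    """Build a search body for a histogram aggregation on one _type.

    All field and type names passed in are hypothetical placeholders.
    If subtype_field is given, the histogram is computed per sub-type
    via a terms aggregation; otherwise it covers the whole _type.
    """
    histogram = {"histogram": {"field": value_field, "interval": interval}}

    if subtype_field is not None:
        # One histogram bucket set per sub-type value (cardinality is about 20).
        aggs = {
            "by_subtype": {
                "terms": {"field": subtype_field, "size": 20},
                "aggs": {"values": histogram},
            }
        }
    else:
        # Single histogram over all documents of the _type.
        aggs = {"values": histogram}

    return {
        "size": 0,  # only aggregation results are needed, no hits
        "query": {"filtered": {"filter": {"type": {"value": doc_type}}}},
        "aggs": aggs,
    }
```

Calling `build_histogram_query("my_type", "value", 10)` gives the shape without the sub-type, and adding `subtype_field="subtype"` gives the shape with it.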
My requirements:
- query should return in <3 seconds
- one index per month (or probably week)
- continuously adding new data to the most recent index
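
The per-month or per-week index requirement could be sketched as a small naming helper. The `data` prefix and the name patterns are hypothetical; the post does not say how the indices are actually named. Weekly names use the ISO week, so a date late in December can fall into week 1 of the following year.

```python
from datetime import date


def index_name(day, prefix="data", granularity="month"):
    """Return the index name a document dated `day` would be written to.

    The prefix and the naming pattern are illustrative assumptions.
    """
    if granularity == "month":
        return "%s-%04d.%02d" % (prefix, day.year, day.month)
    if granularity == "week":
        # isocalendar() yields (ISO year, ISO week number, ISO weekday).
        iso_year, iso_week, _ = day.isocalendar()
        return "%s-%04d.w%02d" % (prefix, iso_year, iso_week)
    raise ValueError("granularity must be 'month' or 'week'")
```

For example, `index_name(date(2014, 12, 15))` returns `data-2014.12`, and the same date with `granularity="week"` returns `data-2014.w51`.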
Questions:
- How to find out the bottleneck of this query?
- What are the tuning options?
- Over time there are serious heap issues: the heap grows and a lot of time
is spent in parallel and full GC. After a restart the used heap is about 3GB,
and several GCs keep it at this level. But over hours the usage grows
towards 8GB and a full GC is no longer able to free it. A restart is
required. Why?
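
One way to start narrowing down the heap question might be to watch the node stats over time and see which part of the heap grows. The sketch below summarizes an already parsed response of `GET /_nodes/stats`. It assumes the response carries `jvm.mem.heap_used_in_bytes`, `jvm.mem.heap_max_in_bytes` and `indices.fielddata.memory_size_in_bytes` per node, and it treats fielddata only as a candidate worth checking; I have not confirmed that it is the cause here.

```python
def summarize_heap(nodes_stats):
    """Summarize heap usage and fielddata size per node.

    `nodes_stats` is the parsed JSON of a node stats response.
    The keys read here are assumptions about that response layout;
    missing keys are treated as zero instead of raising.
    """
    summary = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        mem = node.get("jvm", {}).get("mem", {})
        used = mem.get("heap_used_in_bytes", 0)
        limit = mem.get("heap_max_in_bytes", 0)
        fielddata = (
            node.get("indices", {})
            .get("fielddata", {})
            .get("memory_size_in_bytes", 0)
        )
        summary[node_id] = {
            # Percentage of the configured heap currently in use.
            "heap_used_pct": round(100.0 * used / limit, 1) if limit else None,
            # Heap held by fielddata, in megabytes.
            "fielddata_mb": round(fielddata / (1024.0 * 1024.0), 1),
        }
    return summary
```

Sampling this every few minutes and comparing the fielddata size against the used heap would at least show whether the two grow together between restarts.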
regards,
markus