Please format your code using the `</>` icon as explained in this guide. It will make your post more readable.
Or use markdown style like:
```
CODE
```
Some thoughts:
- What exactly is the full query? I mean, is the `query_string` part inside a `query` or a `filter`?
- `analyze_wildcard`: do you really intend to run queries like `foo*bar`? As the doc says, it's super slow.
- Do you really want to compute a bucket for every 3 hours but for the full 6 days? Don't you want to add a filter by date and just look at the last 24 hours, for example?
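To make the first and last points concrete, here is a minimal sketch of a request body that puts the `query_string` clause and a date range in filter context and only buckets the last 24 hours. The field name `@timestamp`, the query text `status:error` and the aggregation name `per_3h` are assumptions, since the full query was not posted. On recent Elasticsearch versions `interval` would be `fixed_interval`.

```python
import json


def build_last_24h_body(query_text, time_field="@timestamp"):
    """Build a search body restricted to the last 24 hours (sketch only)."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {
            "bool": {
                # Filter context: no relevance scoring is computed.
                "filter": [
                    {"query_string": {"query": query_text}},
                    {"range": {time_field: {"gte": "now-24h"}}},
                ]
            }
        },
        "aggs": {
            # One bucket per 3 hours, as in the original question.
            "per_3h": {
                "date_histogram": {"field": time_field, "interval": "3h"}
            }
        },
    }


# "status:error" is a hypothetical query string.
body = build_last_24h_body("status:error")
print(json.dumps(body, indent=2))
```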
What are the index settings? How many shards per day?
Also, using `_exists_:aggregate_final` is most likely going to give back all the documents in your use case. So you are probably computing an aggregation on 3.5 billion docs, plus the cost of running the query, which could be faster with a `match_all`.
One thing you can do is to run a query filtered per day and compute the agg only for that day. Then use a multisearch query to run 5 of them in parallel.
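A rough sketch of how such a multisearch payload could be built, one search per day. The index pattern `my-index-*`, the field `@timestamp`, the aggregation name and the dates are all hypothetical, and `interval` would be `fixed_interval` on recent versions.

```python
import json
from datetime import date, timedelta


def build_msearch_body(last_day, days=5, index="my-index-*", time_field="@timestamp"):
    """Return an NDJSON _msearch payload with one search per day (sketch only)."""
    lines = []
    for offset in range(days):
        day = last_day - timedelta(days=offset)
        next_day = day + timedelta(days=1)
        header = {"index": index}
        search = {
            "size": 0,
            "query": {
                "bool": {
                    "filter": [
                        # Half-open range: the whole day, excluding the next midnight.
                        {"range": {time_field: {"gte": day.isoformat(),
                                                "lt": next_day.isoformat()}}}
                    ]
                }
            },
            "aggs": {
                "per_3h": {"date_histogram": {"field": time_field, "interval": "3h"}}
            },
        }
        lines.append(json.dumps(header))
        lines.append(json.dumps(search))
    # _msearch expects newline-delimited JSON terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical end date, purely for illustration.
payload = build_msearch_body(date(2018, 1, 6))
print(payload)
```

The resulting string would be sent as the body of a `_msearch` request. If you have one index per day, each header line could target that day's index instead of a wildcard pattern.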
Can SSD help me?
Yes.
Should I check mapping because index size is 5 times bigger in raw size?
Yes. Remove `_all`, and remove the `keyword` and `text` fields you don't need.
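As an illustration only, a trimmed mapping could look like the sketch below. It assumes Elasticsearch 5.x, where `_all` can still be disabled per type. The type name `doc` and the fields `message` and `host` are hypothetical, since I have not seen your mapping.

```python
import json

# Assumption: Elasticsearch 5.x mapping syntax with a single type named "doc".
mapping = {
    "mappings": {
        "doc": {
            "_all": {"enabled": False},
            "properties": {
                # Full-text search only: no extra `keyword` sub-field is created.
                "message": {"type": "text"},
                # Exact match and aggregations only: no analyzed `text` version.
                "host": {"type": "keyword"},
            },
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Mapping each string explicitly as either `text` or `keyword` avoids the default dynamic mapping, which indexes it both ways.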
If you are planning to query often on the existence of the `aggregate_final` field, maybe you should simply index that value as a boolean and filter by that.
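A minimal sketch of that idea, assuming you can enrich documents in your own code before indexing them. The flag name `has_aggregate_final` is made up for the example.

```python
def with_presence_flag(doc, source_field="aggregate_final",
                       flag_field="has_aggregate_final"):
    """Return a copy of doc with a boolean recording whether source_field is set."""
    enriched = dict(doc)
    enriched[flag_field] = doc.get(source_field) is not None
    return enriched


# Filter clause that could stand in for `_exists_:aggregate_final` at search time.
flag_filter = {"bool": {"filter": [{"term": {"has_aggregate_final": True}}]}}

# Hypothetical documents, purely for illustration.
docs = [{"aggregate_final": 12.5}, {"other": "x"}]
enriched = [with_presence_flag(d) for d in docs]
print(enriched)
```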
Just some thoughts.