Hi!
We are trying to upgrade our deployment from an Elastic Heroku-addon (Premium 3) to a new deployment on elastic.co, using the I/O optimized general purpose setup.
Old cluster: 8GB RAM + 192GB storage * 1 node * 2 zones
New cluster: 8GB RAM + 240GB storage * 1 node * 2 zones
The old deployment is running 6.8.6, and the new one is running 7.6.2.
Apart from the version upgrade, we’ve made two changes:
- We’ve updated our mapping from
{
event: {
properties: {
action: { type: :keyword },
actor_id: { type: :long },
actor_type: { type: :keyword },
authenticated_by_id: { type: :long },
authenticated_by_type: { type: :keyword },
code: { type: :keyword },
created_at: { type: :date },
id: { type: :long },
message: { type: :text, index: false },
object_id: { type: :long },
object_type: { type: :keyword },
owner_id: { type: :long },
owner_type: { type: :keyword },
partition: { type: :long },
previous_event_id: { type: :long },
reference_id: { type: :long },
reference_type: { type: :keyword },
success: { type: :boolean }
}
}
}
to
{
dynamic: "strict",
properties: {
action: { type: :keyword },
actor_id: { type: :keyword },
actor_type: { type: :keyword },
authenticated_by_id: { type: :keyword },
authenticated_by_type: { type: :keyword },
code: { type: :keyword },
created_at: { type: :date },
details: { type: :object, enabled: false },
id: { type: :keyword },
message: { type: :text, index: false },
object_id: { type: :keyword },
object_type: { type: :keyword },
reference_id: { type: :keyword },
reference_type: { type: :keyword },
root_id: { type: :keyword },
root_type: { type: :keyword },
sequence: { type: :keyword },
success: { type: :boolean }
}
}
So we added dynamic: strict
, plus we renamed a few of the fields (owner -> root
, partition -> sequence
). I'm assuming the only potentially interesting thing here is that we changed some long
properties to keyword
. We did that based on the recommendation in https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html. We're only interested in doing exact matches on these properties, so term queries.
- We applied a sort on the index settings:
index: {
"sort.field": ["created_at", "id"],
"sort.order": ["desc", "desc"]
}
We did this because of the advice in https://www.elastic.co/guide/en/elasticsearch/reference/master/index-modules-index-sorting.html. Our queries always sort on these properties, and our data is a time series.
All in all, we have approximately 600 million documents in the index in the old deployment, which we re-index to the new deployment.
Switching our production application to query this new cluster, with the exact same query as before, apart from the renamed owner_id
and owner_type
, left us with quite a big drop in performance. Hoping someone here can shed some light on what we're doing wrong.
Here's two screenshots from Kibana from the old vs new cluster. Now, I don't know that much about ES, but one thing that stands out to me is Query Cache is much bigger in the old cluster. Is it possible we need to run the new cluster for a significant amount of time to build it up?
The query
First let me explain what we want to do. We want to query the documents, allowing for exact match filtering on:
action
actor_id
actor_type
object_id
object_type
root_id
root_type
reference_id
reference_type
In addition, we want to allow filtering with a range on created_at
.
Typically, only these are used:
root_id
root_type
reference_id
reference_type
The current query we have that does this looks something like this:
{
"query": {
"bool": {
"filter": [
{ "bool": { "must": { "terms": { "root_id": [ "123456" ] } } } },
{ "bool": { "must": { "term": { "root_type": "Place" } } } },
{ "bool": { "must": { "term": { "reference_type": "Place" } } } },
{ "bool": { "must": { "terms": { "reference_id": [ "123456" ] } } } }
]
}
},
"sort": [
{ "created_at": "desc" },
{ "id": "desc" }
]
}
And here is one with a range query:
{
"query": {
"bool": {
"filter": [
{ "bool": { "must": { "terms": { "root_id": [ "123456" ] } } } },
{ "bool": { "must": { "term": { "root_type": "Place" } } } },
{ "bool": { "must": { "term": { "reference_type": "Place" } } } },
{ "bool": { "must": { "terms": { "reference_id": [ "123456" ] } } } },
{ "bool": { "must": { "range": { "created_at": { "gte": "2020-04-01", "lte": "2020-05-01" } } } } }
]
}
},
"sort": [
{ "created_at": "desc" },
{ "id": "desc" }
]
}
Both queries perform better in the old cluster.
- Does anything here look like it would perform slower in ES 7.6 compared to 6.8?
- Is there a better query for achieving what we want that would perform better in ES 7.6?
Thanks!