Hello, I am new here and I am trying to understand a performance issue.
I am running the same index on two different Elasticsearch deployments. The index is identical on both sides (I used the snapshot/restore mechanism to make sure both have exactly the same contents).
One of the Elasticsearch instances runs on EC2, with as much RAM (up to 64 GiB) and CPU as it can eat. The other is deployed with the ECK operator on a node that can provide up to 16 GiB of RAM (I tried giving it 32, but that failed...).
The EC2 deployment has 0 dedicated master nodes and 1 data node (no replicas), holding a single shard.
The ECK instance has 1 master node and 1 data node (both deployed on the same Kubernetes node). The restore operation that created the index produced the same single shard.
I am now firing(*) /_search requests at both, and I observe a difference in processing time: the EC2 instance is a lot faster than the ECK one. I am trying to understand what is going on and how to get both to the same level of efficiency. On _search requests, the ECK instance is around 200 ms slower than the EC2 instance (see below for an example).
(*) In order to compare results, I port-forward the EC2 port and the ECK HTTP service to my local machine.
It is important to note that, right now, the goal is not to get the best possible result, but at least a comparable one. I know my index could benefit from more shards, but I want to compare two comparable things: EC2 vs ECK, 1 data node, no replicas, 1 shard.
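To keep the comparison honest, I run the same request several times against each forwarded port and average the server-side `took` values, so a single slow run doesn't skew the picture. A minimal sketch of how I measure (the ports, index name, and query body match my local forwarding setup; `measure_took` and `avg_took` are just illustrative helpers):

```python
import json
import statistics
import urllib.request

def measure_took(port, body, runs=5):
    """Send the same _search body `runs` times to one forwarded port
    and collect the server-side "took" values (milliseconds)."""
    takes = []
    data = json.dumps(body).encode()
    for _ in range(runs):
        req = urllib.request.Request(
            f"http://127.0.0.1:{port}/library/_search",
            data=data,
            headers={"content-type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            takes.append(json.load(resp)["took"])
    return takes

def avg_took(takes):
    """Average a list of "took" measurements."""
    return statistics.mean(takes)

# Usage, with both port-forwards running:
#   body = {"query": {"bool": {"should": {"term":
#           {"book_registered_title": "NOTRE-DAME DE PARIS"}}}}}
#   for port in (9200, 9201):
#       print(port, avg_took(measure_took(port, body)))
```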
The index holds 460 million documents, around 100 GiB of data (so, yes, I should definitely be using 4 shards, I know).
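The 4-shard figure is just the usual sizing arithmetic, assuming the commonly cited guideline of keeping each shard roughly between 10 and 50 GiB:

```python
index_size_gib = 100      # approximate size of the library index
target_shard_gib = 25     # a mid-range target within the ~10-50 GiB guideline
shards = round(index_size_gib / target_shard_gib)
print(shards)  # → 4 shards of ~25 GiB each
```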
I don't know what to look for, and I will be happy to post the responses to any diagnostic queries you suggest.
Here are a few important pieces of the setup:
▶ curl -k http://127.0.0.1:9200/library | jq   # the same index definition on both clusters
{
  "library": {
    "aliases": {},
    "mappings": {
      "properties": {
        "author": {
          "type": "wildcard"
        },
        "book_registered_title": {
          "type": "keyword"
        },
        "isbn": {
          "type": "keyword"
        },
        "internal_id": {
          "type": "keyword"
        },
        "book_title": {
          "type": "wildcard"
        }
      }
    },
    "settings": {
      "index": {
        "number_of_shards": "1",
        "provided_name": "library",
        "creation_date": "1657643120070",
        "sort": {
          "field": "internal_id",
          "order": "asc"
        },
        "number_of_replicas": "1",
        "uuid": "QOTS5xzhSMe7KvGEUmifmQ",
        "version": {
          "created": "7090099"
        }
      }
    }
  }
}
Here is the Elasticsearch manifest I applied via the ECK operator:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: pascal
  namespace: es-library-staging
spec:
  version: 7.17.5
  nodeSets:
    - name: data
      count: 1
      config:
        node.roles: [data]
        node.store.allow_mmap: false
        cluster.initial_master_nodes:
          - pascal-es-library-master-0
        xpack.security.authc.realms:
          native:
            native1:
              order: 1
      podTemplate:
        metadata:
          labels:
            dedicatedLabel: "es-library"
        spec:
          tolerations:
            - key: "dedicatedTaint"
              operator: "Equal"
              value: "es-library"
              effect: "NoSchedule"
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  podAffinityTerm:
                    labelSelector:
                      matchLabels:
                        elasticsearch.k8s.elastic.co/cluster-name: pascal
                    topologyKey: kubernetes.io/hostname
          containers:
            - name: elasticsearch
              resources:
                requests:
                  cpu: "125m"
                  memory: 2Gi
                limits:
                  cpu: 4
                  memory: 16Gi
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 250Gi
    - name: master
      count: 1
      config:
        node.roles: [master]
        node.store.allow_mmap: false
        cluster.initial_master_nodes:
          - pascal-es-library-master-0
        xpack.security.authc.realms:
          native:
            native1:
              order: 1
      podTemplate:
        metadata:
          labels:
            dedicatedLabel: "es-library"
        spec:
          tolerations:
            - key: "dedicatedTaint"
              operator: "Equal"
              value: "es-mlc"
              effect: "NoSchedule"
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  podAffinityTerm:
                    labelSelector:
                      matchLabels:
                        elasticsearch.k8s.elastic.co/cluster-name: pascal
                    topologyKey: kubernetes.io/hostname
          containers:
            - name: elasticsearch
              env:
                - name: ES_JAVA_OPTS
                  value: -Xms512m -Xmx512m
              resources:
                requests:
                  cpu: "100m"
                  memory: 1Gi
                limits:
                  cpu: 2
                  memory: 2Gi
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
Example of a query:
curl -k http://127.0.0.1:9201/library/_search -H "content-type:application/json" -d'{
  "post_filter": {
    "bool": {
      "should": {
        "wildcard": {
          "author": {
            "value": "*VICTOR HUGO*"
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "should": {
        "term": {
          "book_registered_title": "NOTRE-DAME DE PARIS"
        }
      }
    }
  },
  "search_after": [
    ""
  ],
  "size": 10000,
  "sort": [
    {
      "internal_id": {
        "order": "asc"
      }
    }
  ]
}'
On ECK, I get the following result:
{
  "took": 626,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 74,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [...]
  }
}
and on EC2 I get this:
{
  "took": 143,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 74,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [...]
  }
}
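Pulling the `took` fields out of the two responses above gives the exact server-side gap for this particular example (larger than the average figure I quoted earlier):

```python
import json

# "took" values copied from the two responses above (milliseconds).
eck = json.loads('{"took": 626}')
ec2 = json.loads('{"took": 143}')

delta_ms = eck["took"] - ec2["took"]
print(delta_ms)  # → 483: ECK is ~480 ms slower on this query
```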
Do you have any suggestions to help me investigate why one of them is so much slower than the other?