Hi,
I am currently using AWS ES 6.2 for the percolator. I have a dedicated cluster for the percolator which has a single index that contains ~16,000 percolator queries (~30MB).
The cluster receives about 10,000 percolate requests per minute. Under this load, CPU utilization peaks at about 80%.
Here's my cluster configuration:
Instance type: c4.4xlarge.elasticsearch
Nodes: 20 data + 3 master
Shards: 8 Primary + 9 Replica
Sample percolator query:
{
  "parentNode": "123",
  "isActive": true,
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "\"great results\", \"good results\", \"excellent results\"",
                  "analyzer": "english"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
Sample request sent to the percolator:
{
  "size": 100,
  "_source": false,
  "query": {
    "bool": {
      "filter": {
        "percolate": {
          "field": "query",
          "documents": [contains a list of 100 documents with same metafields]
        }
      },
      "must": [
        {
          "terms": {
            "parentNode": ["123", "234"]
          }
        },
        {
          "term": {
            "isActive": true
          }
        }
      ]
    }
  }
}
I am using the two filters (parentNode and isActive) to narrow down the candidate queries to percolate against.
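For anyone who wants to reproduce the request, here is a minimal sketch of how a body with this shape could be assembled in Python and serialized to valid JSON. The function name build_percolate_request and the sample document field "message" are hypothetical and only for illustration; the structure mirrors the sample request above and is not meant as a tuned or recommended form.

```python
import json


def build_percolate_request(documents, parent_nodes, size=100):
    """Build a search body that percolates `documents` against stored queries,
    restricted to the given parentNode values and to active queries.

    Hypothetical helper: the field names (query, parentNode, isActive)
    are taken from the sample request in this post.
    """
    return {
        "size": size,
        "_source": False,  # serialized as JSON false by json.dumps
        "query": {
            "bool": {
                "filter": {
                    "percolate": {
                        "field": "query",
                        "documents": list(documents),
                    }
                },
                "must": [
                    {"terms": {"parentNode": list(parent_nodes)}},
                    {"term": {"isActive": True}},
                ],
            }
        },
    }


if __name__ == "__main__":
    # "message" is an assumed document field, not one from the original post.
    docs = [{"message": "we saw great results this quarter"}]
    print(json.dumps(build_percolate_request(docs, ["123", "234"]), indent=2))
```

Building the body as a Python dict and passing it through json.dumps avoids the Python-style False and single-quoted strings that are not valid JSON.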
Am I using the percolator wrong? The documentation states that performance of the percolator has improved a lot from earlier versions (I used AWS ES 1.5 earlier) but I only see the performance degrading. Can I get any suggestions on this?
Thank you