Hello every body,
First of all I would like to thank you for your help.
My problem is about query slow response. I have read a lot in the forum about the cluster configuration, shards and so on, and this is finally my Cluster configuration:
- 1 Dedicated Master Node. Let’s call Node 1.
- 2 Nodes dedicated for Ingest/Write and store Data. Let’s call Node 2 & Node 3.
- 2 Coordinate Node that can be elected as Master and for query purpose, also to run kirbana. Let’s call Node 4 & Node 5.
- 1 Node with Data Only & Query “back up”. Let’s call Node 6.
- 1 Node with Ingest/Write and store Data as “back up” Let’s Call node 7.
So when I make write queries (We are doing with NEST API on C# program), we do on our test from 2 different instances, with this pool of connection:
- Instance 1:
- Node 2.
- Node 7.
- Node 3
- Instance 2:
- Node 3.
- Node 7.
- Node 2.
Each node has the same configuration W10, Intel i5, 8GB(September will be 32GB), 256GB SSD.
Versions
- elasticsearch 7.8
- kirbana 7.8
We are having good results for us on write (amount 20K of documents stored per second).
Our Index & Mapping has been created with this pattern:
PUT /pocIndex
{
"settings": {
"number_of_shards": 4,
"index.auto_expand_replicas": "0-all",
"refresh_interval": -1,
"index.requests.cache.enable": true,
"index.max_result_window": "1000000",
"sort.field" : "date",
"sort.order" : "desc"
},
"mappings": {
"properties": {
"date": {
"type": "date",
"store": true
},
"topic_id": {
"type": "keyword",
"store": true,
"norms": false
},
"topic": {
"type": "keyword",
"store": true,
"norms": false
},
"tag_id": {
"type": "keyword",
"store": true,
"norms": false
},
"tag_position": {
"type": "keyword"
},
"device_criteria": {
"type": "keyword",
"index": false
},
"device_type": {
"type": "keyword",
"index": false
},
"job_id": {
"type": "keyword",
"store": true,
"norms": false
},
"value": {
"type": "float",
"store": true,
"index": false
},
"trigger_value": {
"type": "integer",
"store": true,
"index": false
}
}
}
The data we are storing is about 7 million documents per day.
So here comes the problem, on our user case we have to make two kinds of queries:
- Get All the mount data of one hour.
- Getting a specific topic in a date range.
Queries have been done with kirbana.
Don’t know how to manage this because as you can see there is no a lot to configure with first query, but with this I’m having a response of 16-20 seconds with a hits value of 279612886, but if I Change the date to a minor range I have a 5-10 seconds response with a hits values of 270755, In my opinion both results are too slow, I expected to have more or less 7 million documents in 1-2 seconds, am I crazy????:
GET / pocIndex /_search
{
"size": 300000,
"_source": ["topic", "tag_id","job_id","value","date"],
"track_total_hits": true,
"query":{
"bool":{
"filter": [
{
"range": {
"date": {
"gte": "2019-07-20T00:00:00",
"lte": "2020-07-21T02:00:00"
}
}
}
]
}
}
}
Also I have another issue when I make this query, that I can’t understand:
> GET /_cat/thread_pool/search?v&h=node_name,name,active,rejected,completed
node_name name active rejected completed
Node 4 search 0 0 0
Node 2 search 0 0 46139
Node 5 search 0 0 50812
Node 6 search 0 0 193370
Node 1 search 0 0 74995
Node 7 search 0 0 154707
Node 3 search 0 0 81846
How can be possible Node 4 has nothing when the purpose is to route queries? And Node 1, why has queries if In my app I never send anything to him and is not in the pool connection??