Hello,
I have an Elasticsearch index with 2 billion documents.
My Elasticsearch cluster has 2 data nodes with 32GB of memory each.
The index looks like this:
{
  "mappings": {
    "properties": {
      "userName": {
        "type": "keyword"
      },
      "productName": {
        "type": "text",
        "analyzer": "korean_nori_analyzer",
        "fields": {
          "standard": {
            "type": "text",
            "analyzer": "standard"
          }
        }
      }
    }
  },
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed"
          }
        },
        "analyzer": {
          "korean_nori_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        }
      },
      "number_of_shards": 10
    }
  }
}
I use the 'userName' field as the _routing value, so a search for a specific user's data only has to hit one shard.
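For reference, this is roughly how I understand routing to pick a shard. It is only a sketch: Elasticsearch hashes the _routing value with Murmur3, while zlib.crc32 is used here as a stand-in hash, so the shard numbers will not match what the cluster actually computes. The function name shard_for is made up for illustration, and the formula ignores number_of_routing_shards.

```python
import zlib

NUM_PRIMARY_SHARDS = 10  # "number_of_shards" from the index settings above


def shard_for(routing: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Simplified routing: shard = hash(_routing) % number_of_primary_shards.

    Assumption: zlib.crc32 stands in for the Murmur3 hash Elasticsearch uses,
    so only the shape of the computation is meaningful, not the exact result.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


if __name__ == "__main__":
    # The same routing value always maps to the same single shard, which is why
    # a search with ?routing=user1 only has to visit one of the 10 shards.
    print(shard_for("user1"), shard_for("user1"))
```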
Each shard is 20~30GB, so I don't think shard size is the problem.
But when I ran GET _search?routing=user1 against my index with the query below, it took more than 5 seconds.
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "삼다수",
            "fields": ["productName", "productName.standard"]
          }
        }
      ],
      "filter": [
        {
          "term": {
            "userName": "user1"
          }
        }
      ]
    }
  }
}
Even after I removed the query context and kept only the filter, the query time easily exceeded 5 seconds.
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "userName": "user1"
          }
        }
      ]
    }
  }
}
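Both requests above share the same shape, so here is a small Python sketch that builds them, in case someone wants to reproduce the timing. The helper name build_search_body is hypothetical. The optional "profile": true flag is an assumption about a possible next diagnostic step (the standard search profiling option), not something reflected in the timings reported in this post.

```python
import json
from typing import Optional


def build_search_body(user_name: str, text: Optional[str] = None, profile: bool = False) -> dict:
    """Build the bool query from this post for one user.

    With text=None this is the filter-only request; with text set it adds the
    multi_match clause in query context.
    """
    bool_query: dict = {"filter": [{"term": {"userName": user_name}}]}
    if text is not None:
        bool_query["must"] = [
            {
                "multi_match": {
                    "query": text,
                    "fields": ["productName", "productName.standard"],
                }
            }
        ]
    body: dict = {"query": {"bool": bool_query}}
    if profile:
        # Assumption: enable search profiling to get a per-shard timing breakdown.
        body["profile"] = True
    return body


if __name__ == "__main__":
    # Send each printed body to GET <index>/_search?routing=user1
    print(json.dumps(build_search_body("user1", "삼다수"), ensure_ascii=False, indent=2))
    print(json.dumps(build_search_body("user1", profile=True), indent=2))
```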
How can I tune the performance of my Elasticsearch index?
I want queries to return within 1 second.
Is this query time normal for 2 billion documents?
And what about increasing the number of shards from 10 to 100? My guess is that more shards would mean faster queries, because each query still hits only one shard and that shard would be smaller.
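The rough arithmetic behind that guess, as a sketch. It assumes documents and bytes are spread evenly across shards, which may not hold when routing by userName, and the function name per_shard is made up for illustration.

```python
TOTAL_DOCS = 2_000_000_000   # 2 billion documents in the index
CURRENT_SHARDS = 10          # current "number_of_shards"
SHARD_SIZE_GB = (20, 30)     # observed size range of one current shard


def per_shard(new_shards: int) -> tuple:
    """Return (docs per shard, min GB per shard, max GB per shard).

    Assumes an even spread of documents and bytes over the shards.
    """
    docs = TOTAL_DOCS // new_shards
    low_gb = SHARD_SIZE_GB[0] * CURRENT_SHARDS / new_shards
    high_gb = SHARD_SIZE_GB[1] * CURRENT_SHARDS / new_shards
    return docs, low_gb, high_gb


if __name__ == "__main__":
    print(per_shard(10))   # (200000000, 20.0, 30.0)
    print(per_shard(100))  # (20000000, 2.0, 3.0)
```

So with 100 shards a routed search would touch roughly a tenth of the documents it touches today, if the even-spread assumption holds.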
Thanks.