Hi everyone,
I've run into a serious performance problem: a has_child query
takes a few minutes to return.
Here are the details:
Elasticsearch configuration:
Elasticsearch version: 5.2.2
Total documents: 100 billion
Total storage used: 30 TB
Number of indices: 20 (searched through an alias)
Number of shards: 1700
Number of nodes: 15 (each with a 30 GB heap)
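For a sense of scale, here is a rough back-of-the-envelope sketch of what these figures work out to per shard and per node. It assumes the data is spread evenly and that "30 T" means decimal terabytes; the real distribution in my cluster is probably less uniform.

```python
# Rough per-shard and per-node averages from the figures above.
# Assumption: documents and storage are spread evenly across shards and nodes.
total_docs = 100_000_000_000   # 100 billion documents
total_bytes = 30 * 10**12      # 30 TB, taken as decimal terabytes
shards = 1700
nodes = 15

docs_per_shard = total_docs / shards
gb_per_shard = total_bytes / shards / 10**9
shards_per_node = shards / nodes
tb_per_node = total_bytes / nodes / 10**12

print(f"{docs_per_shard / 1e6:.1f} million docs per shard")
print(f"{gb_per_shard:.1f} GB per shard")
print(f"{shards_per_node:.0f} shards per node")
print(f"{tb_per_node:.1f} TB per node")
```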
Mappings:
**I have one parent type with one million documents**, and each of its fields is mapped like this:
"field_name": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
I also have about 20 child types, whose fields are all mapped the same way. The number of child documents is very large: 100 billion minus the one million parent documents.
So when I query the parent type like this, it takes a few minutes:
{
  "query": {
    "constant_score": {
      "filter": {
        "has_child": {
          "child_type": "app_signature",
          "query": {
            "constant_score": {
              "filter": {
                "term": {
                  "app_signature_notafter.keyword": "Tue Dec 03 16:52:52 CST 2041"
                }
              }
            }
          }
        }
      }
    }
  }
}
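In case it helps anyone reproduce this, here is a minimal sketch that builds the same request body in Python. The helper name `build_has_child_body` is hypothetical and the sketch only constructs and prints the JSON; it does not send anything to the cluster. The optional `profile` flag adds the top-level `"profile": true` setting of the search Profile API, which might help show where the time is spent.

```python
import json


def build_has_child_body(child_type, field, value, profile=False):
    """Build a search body with a has_child term filter (hypothetical helper)."""
    body = {
        "query": {
            "constant_score": {
                "filter": {
                    "has_child": {
                        "child_type": child_type,
                        "query": {
                            "constant_score": {
                                "filter": {"term": {field: value}}
                            }
                        },
                    }
                }
            }
        }
    }
    if profile:
        # Top-level flag that asks Elasticsearch for a timing breakdown.
        body["profile"] = True
    return body


body = build_has_child_body(
    "app_signature",
    "app_signature_notafter.keyword",
    "Tue Dec 03 16:52:52 CST 2041",
    profile=True,
)
print(json.dumps(body, indent=2))
```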
How can I improve the query performance? Should I add nodes or shards? And is there a way to reduce the query time to under 10 seconds?
Thank you very much!