Performance of has_child query on large amounts of data

9ian1i · January 11, 2018, 5:03am

Hi everyone,
I've recently run into a serious problem: a has_child query takes a few minutes to run.
Here are the details:

Elasticsearch configuration:
ES version: 5.2.2
total documents: 100 billion
total storage used: 30 TB
number of indices: 20 (searched through an alias)
number of shards: 1700
number of nodes: 15 (each with 30 GB of heap memory)
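To put these numbers in perspective, here is a rough back-of-the-envelope calculation of the per-shard and per-node load. It is only a sketch: it assumes "30 T" means 30 TB (taken as 30,000 GB), that the 1700 shards are spread evenly over the 15 nodes, and that documents and storage are spread evenly over the shards. The parent count of one million comes from the mapping description in this post; the function name `cluster_ratios` is made up for illustration.

```python
def cluster_ratios(total_docs, parent_docs, storage_gb, shards, nodes):
    """Return rough averages, assuming a perfectly even distribution."""
    child_docs = total_docs - parent_docs
    return {
        "shards_per_node": shards / nodes,
        "docs_per_shard": total_docs // shards,
        "gb_per_shard": storage_gb / shards,
        "children_per_parent": child_docs / parent_docs,
    }


ratios = cluster_ratios(
    total_docs=100_000_000_000,  # 100 billion documents
    parent_docs=1_000_000,       # one million parent documents
    storage_gb=30 * 1000,        # assumption: 30 TB = 30,000 GB
    shards=1700,
    nodes=15,
)

for name, value in ratios.items():
    print(f"{name}: {value:,.1f}")
```

Under those assumptions each node holds a bit over a hundred shards, and each parent has on the order of a hundred thousand children on average.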

Mappings:

**I have one parent type with one million documents**, and every field is mapped like this:

"field_name": {
	"type": "text",
	"fields": {
		"keyword": {
			"type": "keyword",
			"ignore_above": 256
		}
	}
}

I also have about 20 child types, and every field in them is mapped the same way as above. The number of child documents is very large (100 billion minus the number of parent documents).
So when I query the parent type like this, it takes a few minutes:

{
  "query": {
    "constant_score": {
      "filter": {
        "has_child": {
          "child_type": "app_signature",
          "query": {
            "constant_score": {
              "filter": {
                "term": {
                  "app_signature_notafter.keyword": "Tue Dec 03 16:52:52 CST 2041"
                }
              }
            }
          }
        }
      }
    }
  }
}
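For anyone who wants to reproduce this against other child types or fields, the same request body can be generated programmatically. The sketch below only builds and prints the JSON; it does not talk to a cluster. The helper name `has_child_filter` is hypothetical, and the `child_type` key is kept exactly as written in the query above.

```python
import json


def has_child_filter(child_type, field, value):
    """Build a has_child query wrapped in constant_score, as in the post."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "has_child": {
                        "child_type": child_type,
                        "query": {
                            "constant_score": {
                                # exact match on the keyword sub-field
                                "filter": {"term": {field: value}}
                            }
                        },
                    }
                }
            }
        }
    }


body = has_child_filter(
    "app_signature",
    "app_signature_notafter.keyword",
    "Tue Dec 03 16:52:52 CST 2041",
)
print(json.dumps(body, indent=2))
```

The printed body can then be sent to the parent type's `_search` endpoint with whatever client is in use.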

How can I improve query performance? Should I add nodes or shards? And is there a way to reduce the query time to less than 10 seconds?
Thank you very much!

