Poor Query Performance using nested document structure

Jagdish_agarwal · May 7, 2018, 6:23am

We are building a search feature for our existing product. The document structure looks like this:

{
	"_index" : "apr-2018-feed",
	"_type" : "product",
	"_id" : "8102637039f76d20a0adc1257c14ee08",
	"_source" : {
		"id" : "8102637039f76d20a0adc1257c14ee08",
		"field1" : "value",
		"field2" : 6495,
		"field3" : "",
		"field4" : "value",
		"field5" : 23922,
		"dateField" : "2018-03-02",
		"valueField" : 10000000,
		"clusters" : [{
				"clusterId" : 4919,
				"clusterName" : "XYZ",
				
				"innerClusters" : [{
						"innerClusterId" : 118760075,
						"field1" : "value",
						"field2" : 6495,
						"field3" : "",
						"field4" : "value",
						"field5" : 23922,
						"attributeStore1" : [{
								"name" : "attr1",
								"value" : "attrVal"
							}, {
								"name" : "attr2",
								"value" : "attrVal"
							}, {
								"name" : "attr3",
								"value" : "attrVal"
							}, {
								"name" : "attr4",
								"value" : "attrVal"
							}
						],
						"attributeStore2" : [{
								"name" : "attr5",
								"value" : "attrVal"
							}, {
								"name" : "attr6",
								"value" : "attrVal"
							}, {
								"name" : "attr7",
								"value" : "attrVal"
							}, {
								"name" : "attr8",
								"value" : "attrVal"
							}
						]
					},{
						"innerClusterId" : 118760076,
						"field1" : "value",
						"field2" : 6495,
						"field3" : "",
						"field4" : "value",
						"field5" : 23922,
						"attributeStore1" : [{
								"name" : "attr1",
								"value" : "attrVal"
							}, {
								"name" : "attr2",
								"value" : "attrVal"
							}, {
								"name" : "attr3",
								"value" : "attrVal"
							}, {
								"name" : "attr4",
								"value" : "attrVal"
							}
						],
						"attributeStore2" : [{
								"name" : "attr5",
								"value" : "attrVal"
							}, {
								"name" : "attr6",
								"value" : "attrVal"
							}, {
								"name" : "attr7",
								"value" : "attrVal"
							}, {
								"name" : "attr8",
								"value" : "attrVal"
							}
						]
					}
				]
			}
		]
	}
}

This is the document structure that I am using.

Document
|__ Clusters
    |__ InnerCLusters
        |__ AttrStore1
        |__ AttrStore2
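
A rough sketch of what this hierarchy costs at the Lucene level, assuming every array of objects in the sample (clusters, innerClusters, attributeStore1, attributeStore2) is mapped as nested, so that each object in those arrays is stored as its own hidden Lucene document. The helper names are made up for illustration, and the sample mirrors the document shown above with the plain fields trimmed.

```python
# Assumption: all four object arrays are mapped with type "nested".
NESTED_FIELDS = {"clusters", "innerClusters", "attributeStore1", "attributeStore2"}


def count_lucene_docs(obj):
    """Count 1 for the object itself plus 1 for every nested object beneath it."""
    total = 1
    for key, value in obj.items():
        if key in NESTED_FIELDS and isinstance(value, list):
            total += sum(count_lucene_docs(item) for item in value)
    return total


def attrs(names):
    """Build an attribute store like the ones in the sample document."""
    return [{"name": name, "value": "attrVal"} for name in names]


def inner_cluster(inner_id):
    """Build one inner cluster with two attribute stores of four entries each."""
    return {
        "innerClusterId": inner_id,
        "attributeStore1": attrs(["attr1", "attr2", "attr3", "attr4"]),
        "attributeStore2": attrs(["attr5", "attr6", "attr7", "attr8"]),
    }


sample = {
    "id": "8102637039f76d20a0adc1257c14ee08",
    "clusters": [
        {
            "clusterId": 4919,
            "clusterName": "XYZ",
            "innerClusters": [inner_cluster(118760075), inner_cluster(118760076)],
        }
    ],
}

lucene_docs = count_lucene_docs(sample)
print(lucene_docs)
```

Under that assumption, the single sample document expands to 1 root + 1 cluster + 2 inner clusters + 16 attribute entries.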

We are clustering documents based on document similarity.

We have around 17 million grouped/clustered documents, and the index size is 105.3 GB. The total document count reported by ES is 298.8 million.

We have configured 2 m5.large data nodes (2 vCPUs each, 4 vCPUs in total) with SSD storage. The index has 4 shards (1 shard per CPU core) and 0 replicas, and the segments are merged (1 segment per shard).
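
A back-of-the-envelope sketch of what these figures imply per shard. It assumes the gap between the two document counts comes from nested objects, each stored as a separate Lucene document, and that documents and data are spread evenly across the shards.

```python
# Figures quoted in this post.
top_level_docs = 17_000_000   # grouped/clustered documents
total_docs = 298_800_000      # total documents as reported by ES
index_size_gb = 105.3
shards = 4

# Assumption: even distribution across shards.
docs_per_top_level = total_docs / top_level_docs
docs_per_shard = total_docs / shards
gb_per_shard = index_size_gb / shards

print(f"{docs_per_top_level:.1f} Lucene docs per top-level document")
print(f"{docs_per_shard / 1e6:.1f} million Lucene docs per shard")
print(f"{gb_per_shard:.1f} GB per shard")
```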

ES configuration

bootstrap.memory_lock: true
indices.requests.cache.size: 30%
thread_pool.search.size: 50

Heap
-Xms4g
-Xmx4g

We also ran a match_all query, which takes around 5 seconds (with the cache cleared).

We also tried larger instances, with a total of 16 vCPUs and 120 GB of RAM for Elasticsearch, but the performance was similar.

How should we store the documents so that queries return in under 500 ms?

dadoonet · May 7, 2018, 6:55am

Can you share your exact query?

Jagdish_agarwal · May 7, 2018, 7:04am

Ours is a complex query, which we can explain if required.

Even a simple match_all query takes around 5 seconds:

    GET apr-2018-feed1v4/_search
    {
      "query": {
        "match_all": {}
      }
    }

Response

    {
      "took": 4716,
      "timed_out": false,
      "_shards": {
        "total": 4,
        "successful": 4,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 16843022,
        "max_score": 1,
        ..
      }
    }

dadoonet · May 7, 2018, 7:15am

Can you run it with profiling on?
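
For reference, profiling is switched on by adding "profile": true to the search body. A minimal sketch that builds such a body in Python; the helper name is made up for illustration, and the printed JSON is what would be sent to the _search endpoint of the index.

```python
import json


def with_profiling(search_body):
    """Return a copy of a search body with the Profile API switched on."""
    profiled = dict(search_body)
    profiled["profile"] = True
    return profiled


match_all = {"query": {"match_all": {}}}
profiled_body = with_profiling(match_all)

# Paste this JSON as the body of the search request.
print(json.dumps(profiled_body, indent=2))
```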

Jagdish_agarwal · May 7, 2018, 7:49am

I ran the query with profiling on, but due to network restrictions I cannot attach a file, and the site does not allow more than 7000 characters per post.

I am sharing the profiling info for 1 shard in 2 posts.

    "id": "[KRJZ692RR26VQU9YNDhS5w][apr-2018-di-feed1v4][0]",
    "searches": [
      {
        "query": [
          {
            "type": "ConstantScoreQuery",
            "description": "ConstantScore(#*:* -_type:__*)",
            "time": "277625.2845ms",
            "time_in_nanos": 277625284467,
            "breakdown": {
              "score": 1819332110,
              "build_scorer_count": 2,
              "match_count": 37359599,
              "create_weight": 1127914,
              "next_doc": 167896909321,
              "match": 107828448875,
              "create_weight_count": 1,
              "next_doc_count": 37359600,
              "score_count": 2105306,
              "build_scorer": 2641739,
              "advance": 0,
              "advance_count": 0
            },
            "children": [
              {
                "type": "BooleanQuery",
                "description": "#*:* -_type:__*",
                "time": "138158.4556ms",
                "time_in_nanos": 138158455619,
                "breakdown": {
                  "score": 0,
                  "build_scorer_count": 2,
                  "match_count": 37359599,
                  "create_weight": 467519,
                  "next_doc": 101389480737,
                  "match": 36691674377,
                  "create_weight_count": 1,
                  "next_doc_count": 37359600,
                  "score_count": 0,
                  "build_scorer": 2113784,
                  "advance": 0,
                  "advance_count": 0
                },
                "children": [
                  {
                    "type": "MatchAllDocsQuery",
                    "description": "*:*",
                    "time": "32474.96375ms",
                    "time_in_nanos": 32474963749,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2480,
                      "next_doc": 32437595592,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 37359600,
                      "score_count": 0,
                      "build_scorer": 6074,
                      "advance": 0,
                      "advance_count": 0
                    }
                  },
                  {
                    "type": "MultiTermQueryConstantScoreWrapper",
                    "description": "_type:__*",
                    "time": "95148.66580ms",
                    "time_in_nanos": 95148665802,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2580,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 1546845,
                      "advance": 95111862080,
                      "advance_count": 35254294
                    }
                  }
                ]
              }
            ]
          },
          {
            "type": "BooleanQuery",
            "description": "_type:__Clusters _type:__Clusters.InnerCLusters _type:__Clusters.InnerCLusters.AttrStore2 _type:__Clusters.InnerCLusters.AttrStore3 _type:__Clusters.InnerCLusters.AttrStore1",
            "time": "32783.88403ms",
            "time_in_nanos": 32783884034,
            "breakdown": {
              "score": 0,
              "build_scorer_count": 2,
              "match_count": 0,
              "create_weight": 65857,
              "next_doc": 0,
              "match": 0,
              "create_weight_count": 1,
              "next_doc_count": 0,
              "score_count": 0,
              "build_scorer": 1331208,
              "advance": 32747232672,
              "advance_count": 35254294
            },

Jagdish_agarwal · May 7, 2018, 7:49am

Part 2

"children": [
                  {
                    "type": "TermQuery",
                    "description": "_type:__Clusters",
                    "time": "1850.491480ms",
                    "time_in_nanos": 1850491480,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 4236,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 53520,
                      "advance": 1848317232,
                      "advance_count": 2116489
                    }
                  },
                  {
                    "type": "TermQuery",
                    "description": "_type:__Clusters.InnerCLusters",
                    "time": "1857.463788ms",
                    "time_in_nanos": 1857463788,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2269,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 8509,
                      "advance": 1855326035,
                      "advance_count": 2126972
                    }
                  },
                  {
                    "type": "TermQuery",
                    "description": "_type:__Clusters.InnerCLusters.AttrStore2",
                    "time": "17757.78522ms",
                    "time_in_nanos": 17757785215,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2201,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 6878,
                      "advance": 17737491247,
                      "advance_count": 20284886
                    }
                  },
                  {
                    "type": "TermQuery",
                    "description": "_type:__Clusters.InnerCLusters.AttrStore3",
                    "time": "5027.549424ms",
                    "time_in_nanos": 5027549424,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2144,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 7084,
                      "advance": 5021800219,
                      "advance_count": 5739974
                    }
                  },
                  {
                    "type": "TermQuery",
                    "description": "_type:__Clusters.InnerCLusters.AttrStore1",
                    "time": "4427.707954ms",
                    "time_in_nanos": 4427707954,
                    "breakdown": {
                      "score": 0,
                      "build_scorer_count": 2,
                      "match_count": 0,
                      "create_weight": 2279,
                      "next_doc": 0,
                      "match": 0,
                      "create_weight_count": 1,
                      "next_doc_count": 0,
                      "score_count": 0,
                      "build_scorer": 8043,
                      "advance": 4422711652,
                      "advance_count": 4985977
                    }
                  }
                ]
              }
            ],
            "rewrite_time": 84186,
            "collector": [
              {
                "name": "CancellableCollector",
                "reason": "search_cancelled",
                "time": "5838.014395ms",
                "time_in_nanos": 5838014395,
                "children": [
                  {
                    "name": "SimpleTopScoreDocCollector",
                    "reason": "search_top_hits",
                    "time": "2053.911836ms",
                    "time_in_nanos": 2053911836
                  }
                ]
              }
            ]
          }
        ],
        "aggregations": []
      }
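
One way to read output like this is to rank the children of a query node by their share of the parent's time. The sketch below does that for the "#*:* -_type:__*" BooleanQuery, using the time_in_nanos values copied from Part 1. Profiling adds its own overhead, so the shares are only a rough guide.

```python
# Timings copied from the shard-0 profile output (nanoseconds).
boolean_query = {
    "description": "#*:* -_type:__*",
    "time_in_nanos": 138158455619,
    "children": [
        {"description": "*:*", "time_in_nanos": 32474963749},
        {"description": "_type:__*", "time_in_nanos": 95148665802},
    ],
}

parent_nanos = boolean_query["time_in_nanos"]

# Sort the children from slowest to fastest.
rows = sorted(
    ((child["description"], child["time_in_nanos"]) for child in boolean_query["children"]),
    key=lambda row: row[1],
    reverse=True,
)

# Share of the parent's time spent in each child.
shares = {description: nanos / parent_nanos for description, nanos in rows}

for description, nanos in rows:
    print(f"{description:<12} {nanos / 1e9:7.1f} s  {shares[description]:.1%}")
```

In this output the "_type:__*" clause, which filters out the hidden nested documents, accounts for the larger share of the BooleanQuery's time.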

dadoonet · May 7, 2018, 8:20am

So you have a lot of nested docs here, which causes a lot of internal "joins" at search time.
A few things I can think of:

Once the segments are loaded in memory (if you have enough memory left for the OS FS cache), hopefully this will be much faster.
Maybe having more shards, with fewer documents per shard, would help to reduce that time.
Depending on your use case, don't use nested when it is not absolutely necessary.

But maybe @jpountz has other ideas?
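
To put the FS cache point in numbers for the original 2-node setup, here is a rough sketch. It assumes 8 GiB of RAM per m5.large node (the published instance spec, not stated in the thread), treats everything outside the 4 GB heap as available to the OS cache, and mixes GiB and GB loosely, so the result is an optimistic upper bound.

```python
ram_per_node_gib = 8     # assumption: m5.large spec, not stated in the thread
heap_per_node_gib = 4    # -Xms4g / -Xmx4g from the original post
nodes = 2
index_size_gb = 105.3

# Upper bound: ignores memory used by the OS and other processes.
fs_cache_upper_bound = (ram_per_node_gib - heap_per_node_gib) * nodes
cacheable_fraction = fs_cache_upper_bound / index_size_gb

print(f"At most {fs_cache_upper_bound} GiB of FS cache for a {index_size_gb} GB index")
print(f"That is roughly {cacheable_fraction:.1%} of the index")
```

Under these assumptions only a small fraction of the index can sit in the FS cache at once on the original nodes.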

system · June 4, 2018, 8:20am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
