We have a ~100TB logs cluster with 26 nodes that currently has about 20k shards in total. We had previously been running 7.6 and upgraded to 7.17.4.
We're running into problems with very slow initial load and refresh in Discover. While we probably have some work to do on the queries, the real issue seems to be the UI. I can see the UI thread in Chrome pegged at 100% for many seconds before any network activity starts. There is a similar long pause when attempting to refresh.
Enabling discover:searchFieldsFromSource seems to make this problem go away (cuts initial load time in half and removes most of the UI lag). However, with that option enabled we lose the ability to use subfields on objects as columns in the results.
Looking at profiles there are a few things that stand out:
In both profiles the _fields_for_wildcard network call takes 10 seconds. Any pointers on what we can do to shorten that?
With discover:searchFieldsFromSource disabled there are two 7-10s pauses where the browser spends time in a Hn.uniqWith call inside flatten. This seems to be the main issue.
Any ideas for how to track down the scaling limitation we're running into?
If you go to "Stack management > index patterns > (your index pattern)" how many fields are reported? This sounds a lot like a mapping explosion (due to dynamic mapping and a change in ingested data, lots and lots of dynamic fields have been created) which is slowing down various parts across the stack.
Good call. On our old cluster right now we have 12658, but in the new cluster we have > 45k for the same logs. That's a good lead.
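To see where the extra fields come from, it can help to count mapped fields per top-level object. Below is a minimal sketch, assuming you have already fetched the mappings object for one index (for example via GET <index>/_mapping) and loaded it as a Python dict. The sample mapping and the function names are made up for illustration.

```python
from collections import Counter


def iter_field_paths(properties, prefix=""):
    """Yield the dotted path of every leaf field and multi-field in a mapping."""
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "properties" in spec:
            # Object or nested field: recurse into its children.
            yield from iter_field_paths(spec["properties"], prefix=path + ".")
        else:
            yield path
        # Multi-fields (e.g. a "keyword" sub-field) also count as fields.
        for sub in spec.get("fields", {}):
            yield f"{path}.{sub}"


def count_by_top_level(mappings):
    """Count fields grouped by the first segment of their dotted path."""
    paths = iter_field_paths(mappings.get("properties", {}))
    return Counter(p.split(".", 1)[0] for p in paths)


# Hypothetical mapping, standing in for the real one from the cluster.
sample_mappings = {
    "properties": {
        "message": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "labels": {
            "properties": {
                "user_1": {"type": "keyword"},
                "user_2": {"type": "keyword"},
                "user_3": {"type": "keyword"},
            }
        },
    }
}

counts = count_by_top_level(sample_mappings)
for top_level, n in counts.most_common():
    print(top_level, n)
```

Running this against the old and new cluster's mappings and comparing the largest buckets should point at the objects that grew.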
It's often a good idea to turn off dynamic mapping in a production system to avoid this. Of course, this comes with all the downsides of a strict allow list (any new field has to be added to the mapping explicitly).
+1 for turning off dynamic mapping
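For reference, turning off dynamic mapping is controlled by the dynamic setting in the mappings. With false, unknown fields stay in _source but are not indexed or added to the mapping; with "strict", documents containing unknown fields are rejected. A rough sketch of building such a mappings body follows; the helper name and the field names are assumptions for illustration, not taken from the cluster above.

```python
import json


def build_mappings(known_fields, dynamic="strict"):
    """Build a mappings body that only allows explicitly listed fields."""
    # Values accepted by the Elasticsearch "dynamic" mapping parameter.
    if dynamic not in (True, False, "strict", "runtime"):
        raise ValueError(f"unsupported dynamic setting: {dynamic!r}")
    return {"dynamic": dynamic, "properties": dict(known_fields)}


# Hypothetical allow list of fields for a logs index.
known_fields = {
    "@timestamp": {"type": "date"},
    "message": {"type": "text"},
    "level": {"type": "keyword"},
}

strict_mappings = build_mappings(known_fields)  # reject unknown fields
lenient_mappings = build_mappings(known_fields, dynamic=False)  # ignore them

print(json.dumps({"mappings": strict_mappings}, indent=2))
```

The printed JSON is the kind of body that would go into an index template or a create-index request.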
Thanks for providing details about uniqWith, @Brian_Turnbull. I opened an issue for that, because I think we can improve performance here: Improve performance of searchsource flattening · Issue #136854 · elastic/kibana · GitHub
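To illustrate why a uniqWith-style call can dominate with this many fields: assuming it behaves like lodash's uniqWith, deduplication with a custom comparator has to compare items pairwise, which is quadratic when most items are distinct. For 45,000 distinct entries that is on the order of a billion comparisons. The sketch below is not Kibana's code, just a small Python model of the two approaches.

```python
def uniq_with(items, same):
    """Deduplicate using a pairwise comparator; returns (kept, comparisons)."""
    kept = []
    comparisons = 0
    for item in items:
        duplicate = False
        for existing in kept:
            comparisons += 1
            if same(item, existing):
                duplicate = True
                break
        if not duplicate:
            kept.append(item)
    return kept, comparisons


def uniq_by_key(items, key):
    """Deduplicate using a hashable key; one set lookup per item."""
    seen = set()
    kept = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            kept.append(item)
    return kept


# Hypothetical field entries, all distinct (the worst case for uniq_with).
fields = [{"field": f"labels.user_{i}"} for i in range(2000)]

pairwise, comparisons = uniq_with(fields, lambda a, b: a["field"] == b["field"])
hashed = uniq_by_key(fields, key=lambda f: f["field"])

print(len(pairwise), len(hashed), comparisons)
```

Both return the same result, but the comparator version does n(n-1)/2 comparisons for n distinct items, while the keyed version does n lookups.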
Yeah, that's on our long-term todo list. This gives us something to work with in the meantime though. We can probably get by with some smart flattening and dynamic templates.
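As a rough idea of what flattening plus dynamic templates could look like: by default, dynamically mapped strings become a text field with a keyword sub-field (two fields each), so a template that maps strings to keyword only cuts that in half. A free-form object can be mapped as the flattened type so that all of its keys live under one mapped field. The template name and the labels field below are hypothetical.

```python
import json

mappings = {
    "dynamic_templates": [
        {
            # Map new string fields to a single keyword field instead of
            # the default text field plus keyword sub-field.
            "strings_as_keyword": {
                "match_mapping_type": "string",
                "mapping": {"type": "keyword", "ignore_above": 1024},
            }
        }
    ],
    "properties": {
        # Hypothetical free-form object: the whole object is indexed as one
        # field, however many keys it contains.
        "labels": {"type": "flattened"},
    },
}

print(json.dumps({"mappings": mappings}, indent=2))
```

One trade-off to check: keys inside a flattened field are not separate mapped fields, so they may not be selectable as individual columns in Discover.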