Parent-child performance issue

Thank you for your answer,

If I understand correctly, switching to eager global ordinals loading will only speed up the first parent/child query. Is that right?

Good to know, thank you. I will try to analyze whether my queries are close to the worst case.

I think I know the answer, but can I expect better performance with more shards, without adding more nodes? What is the limiting hardware resource for parent/child query processing? Is there a way to see where the bottleneck is from a hardware point of view?

Thank you,