Continuous Profile-Guided Optimization (PGO) platform based on Elastic Universal Profiling

Hi!

I have been evaluating multiple approaches to applying Profile-Guided Optimization (PGO) across the software ecosystem (if you are interested, you can check the results here). According to my tests, PGO improves performance for many existing applications.

There are two major kinds of PGO: Instrumentation and Sampling (you can read about both in the Clang compiler documentation). Although Instrumentation-based PGO is, in my experience, much better known in the community, it has major drawbacks. One of the biggest is the performance overhead during the profiling phase, which makes it hard to collect profiles from production workloads.

To resolve this issue, Google developed Sampling PGO, also sometimes called AutoFDO (see the google/autofdo repository on GitHub). This approach uses perf-based profiling to collect PGO profiles directly from the production environment and then uses them during the optimization phase. More details about the whole ecosystem around AutoFDO at Google can be found in their paper.
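For readers unfamiliar with the flow, the three steps (sample, convert, recompile) can be sketched roughly as follows. This is a minimal sketch, assuming a Linux host with `perf` (and hardware branch sampling support), the `create_llvm_prof` tool from google/autofdo, and Clang. The binary name, file names, and the helper `autofdo_commands` are hypothetical, and the script only builds the command lines; it does not run them.

```python
from typing import List


def autofdo_commands(
    binary: str,
    source: str,
    perf_data: str = "perf.data",
    profile: str = "app.prof",
) -> List[List[str]]:
    """Build (but do not run) the command lines of a basic AutoFDO cycle.

    All paths are illustrative placeholders, not taken from the post.
    """
    return [
        # 1. Sample the running program with branch records (-b).
        ["perf", "record", "-b", "-o", perf_data, "--", binary],
        # 2. Convert the perf data into an LLVM sampling profile.
        [
            "create_llvm_prof",
            f"--binary={binary}",
            f"--profile={perf_data}",
            f"--out={profile}",
        ],
        # 3. Recompile, letting the optimizer consume the sampled profile.
        [
            "clang",
            "-O2",
            "-gline-tables-only",
            f"-fprofile-sample-use={profile}",
            source,
            "-o",
            binary,
        ],
    ]


if __name__ == "__main__":
    for cmd in autofdo_commands("./app", "app.c"):
        print(" ".join(cmd))
```

In a continuous setup, step 1 would be replaced by whatever fleet-wide profiler is already collecting samples, which is exactly the role discussed in this thread.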

Unfortunately, right now there is no open-source ecosystem that supports the AutoFDO approach at scale the way it is done at Google. But the Elastic Universal Profiling platform looks like an important piece of such a pipeline: it could play the role that Google-Wide Profiling (GWP) plays in Google's AutoFDO setup.

My idea is to build a system similar to the one Google has internally, but based on Elastic Universal Profiling. It could be an interesting opportunity for Elastic to offer a unique solution in the continuous optimization area.

It would be great to hear thoughts about the idea from the Elastic devs.

Thank you for your attention.

P.S. There is a similar discussion in the Grafana Pyroscope project: "Continuous Profile-Guided Optimization (PGO) platform based on Grafana Pyroscope" (grafana/pyroscope, Discussion #2783 on GitHub).

Hi Alexander,

thank you for the suggestion -- it's a good one!

At optimyze, before being acquired by Elastic, we actually researched this quite a bit a few years ago, and we had working prototypes that generated both LLVM's and GCC's sampling profile formats directly from samples collected by the profiler. With the acquisition, we shifted focus to more pressing issues like migrating backends and haven't had time to look into this again since, but it's definitely still on our roadmap.
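To give a rough idea of what "generating a sampling profile from samples" involves, here is a minimal sketch (not the optimyze prototype). It assumes each sample has already been symbolized to a function name and a line offset from the function's first line, which in practice requires debug info and is skipped here. The `Sample` shape and the `to_llvm_text_profile` helper are hypothetical. The output follows the basic shape of LLVM's text sample profile as described in the Clang documentation (`function:total_samples:total_head_samples`, followed by indented `offset: count` lines); call targets and inlined call sites are omitted.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical sample shape: (function name, line offset within the function).
Sample = Tuple[str, int]


def to_llvm_text_profile(
    samples: Iterable[Sample],
    head_samples: Optional[Dict[str, int]] = None,
) -> str:
    """Aggregate symbolized samples into a simplified LLVM text profile."""
    heads = head_samples or {}
    per_func: Dict[str, Counter] = defaultdict(Counter)
    for func, offset in samples:
        per_func[func][offset] += 1

    lines = []
    for func in sorted(per_func):
        body = per_func[func]
        total = sum(body.values())
        # Head samples are assumed to be supplied by the caller; default to 0.
        lines.append(f"{func}:{total}:{heads.get(func, 0)}")
        for offset in sorted(body):
            # Body lines are indented by one space in the text format.
            lines.append(f" {offset}: {body[offset]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [("main", 1), ("main", 1), ("main", 3), ("foo", 2)]
    print(to_llvm_text_profile(demo), end="")
```

A real converter would also have to handle discriminators, inlined frames, and binaries built without matching debug info, which is where most of the effort goes.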

I can generally confirm what you are saying: it worked great, and we saw some pretty serious performance improvements after recompiling various standard software with PGO profiles generated by these prototypes. PGO support in LLVM, and particularly in BOLT, has also improved a lot since we looked into this, so I expect the improvements would be even larger now.
