[OE-core] [PATCH] python3: enable profile optimized builds

Anuj Mittal anuj.mittal at intel.com
Fri Aug 17 04:48:27 UTC 2018


On 08/17/2018 03:31 AM, Andre McCurdy wrote:
> On Wed, Aug 15, 2018 at 11:26 PM, Anuj Mittal <anuj.mittal at intel.com> wrote:
>> Enable profile guided optimization (pgo) for python3. Enabling pgo in
>> python is generally as simple as invoking the target profile-opt which:
>>
>> - builds python binaries with profile instrumentation enabled,
>> - runs a specific profile task using that python to get the profile
>> data and,
>> - feeds the compiler with this profile data and rebuilds python.
>>
>> This change invokes qemu-user for the second step of running a profile
>> task using target python. Depending on how long profile task takes to
>> run, this might add a significant time to compilation (which would be
>> true for native builds too). The default profile task can be changed by
>> the users depending on what makes sense for their use case (or can be
>> left empty). In case qemu-user isn't supported, profile task won't be run.
> 
> Is it important to re-create the profile data during every build or
> would we get most of the same benefits from using reference data which
> is generated offline? 

I believe we should get the same benefit from data generated offline, as
long as the source code, configure options and flags are the same. I have
only tried with data generated offline using the same build configuration
though.

It would, however, need the Makefiles to be tweaked to pass
-fprofile-dir=<path> when using the profile data, among other things.
Please see this if you'd like an example that works:

https://git.yoctoproject.org/clean/cgit.cgi/poky-contrib/commit/?h=anujm/9338&id=e57654cb51b121e9dfa76e66432c4d37fd339d42

> How big is the data file? Is it binary or text?

Building with gcc -fprofile-generate and running the instrumented python
produces binary .gcda files. They are needed only for the -fprofile-use
rebuild, can be deleted afterwards and aren't installed. For more
information:

https://gcc.gnu.org/onlinedocs/gcc/Gcov-Data-Files.html
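
Since the .gcda files are only needed while rebuilding, cleaning them up
afterwards can be sketched in a few lines of Python. This is a minimal
sketch, not part of the patch: the helper names find_gcda_files and
remove_gcda_files are hypothetical, and it assumes the profile data sits
somewhere under a single build directory.

```python
from pathlib import Path


def find_gcda_files(build_dir):
    """Return all .gcda profile data files under build_dir, sorted."""
    return sorted(Path(build_dir).rglob("*.gcda"))


def remove_gcda_files(build_dir):
    """Delete every .gcda file under build_dir and return how many were removed."""
    removed = 0
    for path in find_gcda_files(build_dir):
        path.unlink()
        removed += 1
    return removed
```

Calling remove_gcda_files() on the build directory after the optimized
rebuild leaves object files and everything else untouched.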

> Is the data expected to be target architecture specific?
>
> If reference data were used, what are the consequences of the data not
> corresponding exactly to the current build configuration? A build
> failure? Or just a decrease in the effectiveness of the optimisation?

From the gcc man page:

"By default, GCC emits an error message if the feedback profiles do not
match the source code. This error can be turned into a warning by using
-Wcoverage-mismatch. Note this may result in poorly optimized code."

> 
> Does the profiling instrumentation measure execution timing? Or only
> the frequency / order in which functions are called? ie is there any
> concern that data generated from running under qemu may not be optimal
> for running on the target?

It tries to identify code hot spots by counting how many times each
branch and call is executed, and how many times each branch is taken or
each call returns, so I don't think it should matter. I generated the
profile both on target hardware and under qemu and, at least
performance-wise, I didn't see any difference. I didn't perform any
exhaustive analysis though.
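
As a rough illustration of why count-based data should not depend on how
fast the profile task runs, here is a toy model in Python. It is not what
gcc's instrumentation actually does: the classify function, the counter
names and the delay parameter are all made up for this sketch, with the
delay standing in for slower execution under emulation.

```python
import time
from collections import Counter


def classify(values, counters, delay=0.0):
    """Count how often each branch is taken; delay simulates a slower host."""
    for v in values:
        time.sleep(delay)  # stand-in for slower (e.g. emulated) execution
        if v % 3 == 0:
            counters["branch_taken"] += 1
        else:
            counters["branch_not_taken"] += 1
    return counters


workload = list(range(30))
fast = classify(workload, Counter())
slow = classify(workload, Counter(), delay=0.001)

# Same workload at different speeds: compare the recorded counts.
print(fast == slow)
print(dict(fast))
```

Both runs record the same counts because the counters depend only on
which paths the workload takes, not on how long each step takes.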



