[OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

Herve Jourdain herve.jourdain at neuf.fr
Tue Jun 12 09:30:48 UTC 2018


Hi,

I believe I'm the "original author" of an earlier patch attempt at tackling this problem, more than a year ago, as referenced in this series.
And I understand why everyone, Khem first among them, would like something "simpler" for ARM.
But the problem is that ARM-based SoCs are very diverse, and ARM defines a number of optional IP blocks (crypto and neon, among others) for each architecture. ARM then defines some "standard" cores (like cortex-A53, cortex-A57, ...) which may make some of those optional IPs mandatory for that core, and leave the rest optional.
And SoC vendors decide which optional IPs they implement...

So when we're talking "cortex-A53", it's not necessarily the same cortex-A53 for all SoC vendors.

GCC does support all that complexity. So the main question is: do we want to be able to generate code that takes advantage of the optional IPs present on a SoC, or do we prefer to settle for the lowest common denominator?
As someone who works close to the SoC, I would definitely prefer to be able to take advantage of the optional IPs present on an ARM SoC, and I'd rather have a system that can at least support that, even if it's slightly more complex. That said, once it's done, most people won't look under the hood but will just use it, so the complexity would end up being hidden, much like now with armv7.
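
To make the trade-off concrete, here is a minimal sketch in Python of how the optional blocks could map to compiler flags for a 32-bit ARMv8-A build. The function name and its parameters are hypothetical and are not taken from the patch series. The flag values (`armv8-a+crc`, `fp-armv8`, `neon-fp-armv8`, `crypto-neon-fp-armv8`) are ones I believe GCC's ARM backend accepts, but check them against the GCC version you actually use.

```python
def armv8_aarch32_flags(crc=False, simd=True, crypto=False, mtune=None):
    """Return illustrative GCC flags for a 32-bit ARMv8-A target.

    Hypothetical helper: it only shows how optional IP blocks change
    the flags, it is not how OE-core tune files are implemented.
    """
    if crypto and not simd:
        # The crypto extension builds on Advanced SIMD (neon).
        raise ValueError("crypto requires simd/neon")

    # CRC is selected through the architecture string.
    march = "armv8-a+crc" if crc else "armv8-a"

    # FP, neon and crypto are selected through the FPU name.
    if crypto:
        mfpu = "crypto-neon-fp-armv8"
    elif simd:
        mfpu = "neon-fp-armv8"
    else:
        mfpu = "fp-armv8"

    flags = ["-march=" + march, "-mfpu=" + mfpu, "-mfloat-abi=hard"]
    if mtune:
        flags.append("-mtune=" + mtune)
    return flags


if __name__ == "__main__":
    # Lowest common denominator: neon only.
    print(" ".join(armv8_aarch32_flags(mtune="cortex-a53")))
    # A SoC whose vendor implemented crc and crypto as well.
    print(" ".join(armv8_aarch32_flags(crc=True, crypto=True, mtune="cortex-a53")))
```

The only difference between the two builds is the pair of `-march`/`-mfpu` values, which is the kind of switch a tune would hide from the user.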

I've personally followed up on my patches from last year, and I now have a slightly modified and simplified version of them, which I've used to build some production-ready environments using cortex-a53/armv8 tunes that trigger the optimizations for cortex-a53 + neon. And if the SoC I'm working with had the crypto extension, I would be very happy to build for it by just switching the tune I use for my cortex-a53 to the armv8 tune supporting crypto.

So I believe now may be a good time to talk this over again, because we're basically building for cortex-a53 with cortexa7/armv7ve tunes, and that is not optimal in my opinion (for example, some instructions that were native in armv7ve are emulated on armv8).

One simplification I did come up with was the handling of thumb: I don't think it needs to be an option anymore, since its support is mandatory in armv8 (though I think that was also the case in armv7). That simplifies things a bit, but it changes nothing fundamental: you still need to carry the support for the optional IPs around...
And in addition to what I proposed to support last year, we now also have to add armv8.1a, armv8.2a, armv8.3a, armv8.4a (so far...), each of which has its own specificities/differences that make it unlikely they can all be supported within a single file.
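
As a rough idea of how quickly the variants add up, here is a small Python sketch that enumerates combinations of architecture revision and optional blocks. The variant names are hypothetical and are not existing OE-core tune names. The count is an upper bound under a simplifying assumption: every revision is treated as having the same optional blocks, whereas later revisions may make some of them mandatory.

```python
from itertools import product

# Architecture revisions mentioned in this thread.
ARCH_REVISIONS = ["armv8a", "armv8.1a", "armv8.2a", "armv8.3a", "armv8.4a"]

# Floating-point/SIMD levels; crypto is only listed together with neon
# because it builds on it.
FP_LEVELS = ["fp", "neon", "neon-crypto"]

# CRC is either absent or present.
CRC_CHOICES = ["", "crc"]


def tune_variants():
    """Enumerate hypothetical tune names, one per feature combination."""
    names = []
    for arch, crc, fp in product(ARCH_REVISIONS, CRC_CHOICES, FP_LEVELS):
        parts = [arch] + ([crc] if crc else []) + [fp]
        names.append("-".join(parts))
    return names


if __name__ == "__main__":
    variants = tune_variants()
    print(len(variants))  # 5 revisions * 2 crc choices * 3 fp levels = 30
    print(variants[:3])
```

Each such combination is a potential package arch in a distro feed, which is the cost side of supporting the optional IPs.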

Thoughts? Can we talk this over, so that we have a chance at good support for armv8-32 in oe, instead of everyone doing their own?

Cheers,
Herve

-----Original Message-----
From: openembedded-core-bounces at lists.openembedded.org [mailto:openembedded-core-bounces at lists.openembedded.org] On Behalf Of Koen Kooi
Sent: Tuesday, June 12, 2018 11:01
To: Randy Li <ayaka at soulik.info>
Cc: OE-core <openembedded-core at lists.openembedded.org>
Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors



> On 9 Jun 2018, at 08:26, Randy Li <ayaka at soulik.info> wrote:
> 
> I read the ARMv8 manual again; it looks like hardware float is
> mandatory for Linux distributions and toolchain libraries. Some
> cortex processors can be configured without FPU/NEON hardware, but I
> don't think they would be used in openembedded core.
> 
> So I can assume that NEON (SIMD) is always present, leaving only
> the crc and crypto instructions as optional here.
> 
> 
> Randy Li (4):
>  arch-armv8a.inc: add tune include for armv8
>  tune-cortexa35: add tunes for ARM Cortex-A35
>  tune-cortexa32: add tunes for ARM Cortex-A32
>  tune-cortexa72: add tunes for ARM Cortex-A72

Having been forced to deal with the mess that’s 32-bit arm tunes: let’s only add implementation-specific tunes *after* having seen conclusive, repeatable benchmark results. 90% of the 32-bit tune files are placebo effect and just explode the number of package archs in your distro feed. The goal of aarch64 was to stop being different for the sake of being different; let’s not make a mess because we are used to messes.

regards,

Koen