[OE-core] Tune files and knobs to turn

Khem Raj raj.khem at gmail.com
Tue Jun 28 20:37:20 UTC 2011


On Tue, Jun 28, 2011 at 1:33 PM, Koen Kooi <koen at dominion.thruhere.net> wrote:
>
> On 28 Jun 2011, at 22:31, Khem Raj wrote:
>
>> On Tue, Jun 28, 2011 at 10:36 AM, Darren Hart <dvhart at linux.intel.com> wrote:
>>>
>>>
>>> On 06/24/2011 04:54 AM, Koen Kooi wrote:
>>>> Hi,
>>>>
>>>> We discussed tune files a bit during last night's TSC meeting and Khem had
>>>> expressed the need before, so I'd like to get this discussion started by using
>>>> armv7a as an example.
>>>>
>>>> For armv7a capable cores we have the following hardware features:
>>>>
>>>> * armv7a instruction set
>>>> * thumb1 instruction set
>>>> * thumb2 instruction set
>>>> * VFP coprocessor
>>>> * optional NEON coprocessor
>>>>
>>>> For the ABI we can choose the following:
>>>>
>>>> * softfp without hw support (e.g. no VFP instructions emitted, slow)
>>>> * softfp with hw support (e.g. VFP and/or NEON instructions emitted, fast)
>>>> * hardfp, emits VFP and/or NEON instructions, slightly faster than softfp/hw,
>>>>   incompatible with everything else
>>>>
>>>> And the extra knobs:
>>>>
>>>> * pure thumb1, no arm instructions (limited use)
>>>> * thumb1/arm interworking
>>>> * pure thumb2, no arm instructions
>>>> * thumb2 interworking (not sure if that's actually useful, thumb2 has complete coverage)
>>>>
>>>> In OE .dev we have the following vars:
>>>>
>>>> TARGET_FPU: switches between hw float and sw float, no reflection in package arch
>>>> ARM_FP_ABI: switches between softfp and hardfp, will create 'armv7a' or
>>>>             'armv7a-hardfp' as package arch
>>>> ARM_INSTRUCTION_SET: switches between arm and thumb1, no reflection in package arch
>>>> THUMB_INTERWORK: turns on interworking, no reflection in package arch
>>>>
>>>> (side note: oe-core/distroless and meta-yocto/poky don't set TARGET_FPU
>>>>  for armv7a and will generate slow code; angstrom does turn it on)
>>>
>>>
>>> oe-core tune-cortexa8.inc doesn't make use of these variables (unlike
>>> meta-texasinstruments) and does make use of the neon coprocessor, but
>>> still uses the softfp float ABI:
>>>
>>> TARGET_CC_ARCH = "-march=armv7-a -mtune=cortex-a8 -mfpu=neon
>>> -mfloat-abi=softfp -fno-tree-vectorize"
>>>
>>> Seems like the oe-core tune files need to be synced up with vendor layers?
>>>
>>
>> Well, enabling hardfp is a fundamental decision, and I guess using softfloat
>> in oe-core is probably the best choice; it is the floating-point parameter-passing
>> ABI I am talking about. We still use -mfpu=neon, so gcc will still try to utilize it,
>> but -fno-tree-vectorize is going to subdue the use of NEON instructions, since gcc
>> is not allowed to vectorize.
>
> Experience has shown that -fno-tree-vectorize generates faster code with gcc 4.5 :)

Someday I will try to benchmark and find out what's going on for myself.
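
A minimal sketch of how the two flag sets for such a comparison could be
generated, assuming the only difference between the builds is the vectorizer
switch. The gcc flags are the ones quoted from tune-cortexa8.inc above; the
helper names (cc_arch, variants) are hypothetical and not part of oe-core.

```python
# Hypothetical helpers for illustration; only the gcc flags come from the thread.
BASE_FLAGS = ["-march=armv7-a", "-mtune=cortex-a8", "-mfpu=neon"]


def cc_arch(float_abi="softfp", vectorize=False):
    """Return a TARGET_CC_ARCH-style flag string for one benchmark variant."""
    if float_abi not in ("softfp", "hard"):
        raise ValueError("unsupported float ABI: %s" % float_abi)
    flags = list(BASE_FLAGS)
    flags.append("-mfloat-abi=" + float_abi)
    # The vectorizer switch is the one knob under test here.
    flags.append("-ftree-vectorize" if vectorize else "-fno-tree-vectorize")
    return " ".join(flags)


def variants():
    """Map a short variant name to its flag string."""
    return {
        "novec": cc_arch(vectorize=False),
        "vec": cc_arch(vectorize=True),
    }


if __name__ == "__main__":
    for name, flags in sorted(variants().items()):
        print("%s: %s" % (name, flags))
```

Each string could then be used as TARGET_CC_ARCH for a separate build of the
same benchmark, so that any timing difference is down to vectorization alone.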



