[OE-core] [PATCH] sstate: Truncate PV in sstate filenames that are too long

Mark Hatle mark.hatle at windriver.com
Tue Jul 30 14:14:01 UTC 2019


On 7/30/19 8:49 AM, Mike Crowe wrote:
> On Tuesday 30 July 2019 at 08:25:52 -0500, Mark Hatle wrote:
>> On 7/30/19 6:01 AM, Mike Crowe wrote:
>>> sstate filenames are generated by concatenating a variety of bits of
>>> package metadata. Some of these parts could be long, which could cause
>>> the filename to be longer than the 255 character maximum for ext4.
>>>
>>> So, let's try to detect this situation and truncate the PV part of the
>>> filename so that it will fit. If this happens, an ellipsis is added to
>>> make it clear that the version number is incomplete.
>>>
>>> SSTATE_PKG needs to be consistent for all tasks so that the hash
>>> remains stable. This means that we need to make an assumption for the
>>> maximum length of the task name. In this implementation, the task name
>>> is limited to 27 characters.
>>>
>>> This change also results in a sensible error message being emitted if
>>> the resulting filename is still too long.
>>>
>>> Signed-off-by: Mike Crowe <mac at mcrowe.com>
>>>
>>> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
>>> index 3342c5ef50..6313b1c538 100644
>>> --- a/meta/classes/sstate.bbclass
>>> +++ b/meta/classes/sstate.bbclass
>>> @@ -8,6 +8,24 @@ def generate_sstatefn(spec, hash, d):
>>>          hash = "INVALID"
>>>      return hash[:2] + "/" + spec + hash
>>>  
>>> +def sstate_path(taskname, d):
>>> +    max_filename_len = 245 # leave some room for ".siginfo"
>>> +    max_addendum_len = 32 # '_' + taskname + '.tgz'
>>
>> Since the task name is variable, is there really a 32 character limit here?
>>
>> It may make sense to do:
>>
>> # '_' + taskname + '.tgz', reserving a minimum of 32 for the addendum
>> max_addendum_len = max(len(taskname) + 5, 32)
>>
>> Always reserve a minimum of 32 for consistency, but if we go over account
>> for it.
> 
> I think that would just cause task hash mismatches (see third paragraph of
> commit message.)
> 
> It probably does make sense to detect such long task names in this
> situation and generate errors though.
> 
>>> +    sstate_prefix = d.getVar('SSTATE_PKG')
>>> +    excess = len(os.path.basename(sstate_prefix)) - (max_filename_len - max_addendum_len)
>>> +    if excess > 0:
>>> +        pv = d.getVar('PV')
>>> +        if len(pv) >= excess and len(pv) >= 3:
>>> +            short_pv = d.getVar('PV')[:-excess-3] + '...'
>>
>> Is truncating the PV enough?  In a discussion on the bitbake list, I suggested
>> possibly changing the order of the entries in the SSTATE_PKGSPEC to allow us to
>> prune things prior to the hash w/o affecting the hash.  Maybe this is simply not
>> needed.. but it's a possibility if this proves to not be effective.
> 
> Truncating PV solves the problem I was having. The other fields don't
> really tend to be very long, so there's less to be gained by shortening
> them.
> 
>>> +            d2 = d.createCopy()
>>> +            d2.setVar('PV', short_pv)
>>> +            sstate_prefix = d2.getVar('SSTATE_PKG')
>>> +
>>> +    sstatepkg = sstate_prefix + '_'+ taskname + ".tgz"
>>> +    if len(os.path.basename(sstatepkg)) > max_filename_len:
>>> +        bb.error('Failed to shorten sstate filename')
>>> +    return sstatepkg
>>> +
>>>  SSTATE_PKGARCH    = "${PACKAGE_ARCH}"
>>>  SSTATE_PKGSPEC    = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
>>>  SSTATE_SWSPEC     = "sstate:${PN}::${PV}:${PR}::${SSTATE_VERSION}:"
>>
>> There is something else I noticed..  "SSTATE_PKGNAME" defined as:
>>
>> SSTATE_PKGNAME    =
>> "${SSTATE_EXTRAPATH}${@generate_sstatefn(d.getVar('SSTATE_PKGSPEC'),
>> d.getVar('BB_UNIHASH'), d)}"
>>
>> From what I can tell, this really should be using the new sstate_path function
>> in some way.
> 
> Hmm, you're right. I can't have covered that in my testing. :(
> 
>> Would it make more sense to define SSTATE_PKGNAME in such a way that it always
>> resulted in something "short" enough, and in the right format, that it would
>> always work?
> 
> I'd considered doing it that way. If SSTATE_PKGSPEC contained ${SSTATE_PV}
> and that either had the value of ${PV} or a truncated version of ${PV}
> then the rest of the file could remain the same. However, truncating PV
> without access to the rest of the spec would mean just picking some
> arbitrary maximum PV length which is likely to be more conservative than
> necessary.
> 
>> Adjusting or rewriting "generate_sstatefn" could still accomplish the PV change,
>> but the max length of the string would need further shrinking to accommodate an
>> unknown task length (which goes back to my previous comment).  If the 32 default
>> is long enough then that shouldn't be a problem -- and may also resolve my
>> concerns that something outside of sstate class could try to use that variable
>> and, without the new magic function, get the wrong results.
> 
> I wonder whether I can get away with applying per-task PV truncation in
> generate_sstatefn without causing hash mismatches? That's worth a try.

The only cause of a task hash mismatch (actual hash, not filename) would be if
SSTATE_PKGNAME is part of the hash itself.  I'd contend that if it is, then we
should exclude it.

-All- of the components of the name itself are already included in the hash.  So
why should we (or the system) care that SSTATE_PKGNAME has a specific value or not?

The important items in the sstate hash name, as far as I'm concerned, are:

sstate (literal), sstate version, the hash, and the 'type' (siginfo, tgz, etc).

Everything else there is for the user to be able to look at the filename and
determine what it is.  It could be used to avoid collisions if we have two items
with the same hash, but otherwise different contents -- but with sha256 this
should be almost impossible.
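
To make that concrete, here is a minimal Python sketch of a filename built from
only those four items. It is an illustration under assumptions, not what
sstate.bbclass does: the function name minimal_sstate_name, the separator
layout, and the example values are all hypothetical. Only the two-character
hash subdirectory and the "INVALID" fallback mirror generate_sstatefn in the
patch context above.

```python
def minimal_sstate_name(sstate_version, task_hash, taskname, ext="tgz"):
    """Build '<hh>/sstate:<version>:<hash>_<task>.<ext>' (hypothetical layout)."""
    if not task_hash:
        # Same fallback that generate_sstatefn uses for a missing hash.
        task_hash = "INVALID"
    basename = "sstate:%s:%s_%s.%s" % (sstate_version, task_hash, taskname, ext)
    # Keep the two-character hash prefix as a subdirectory, as today.
    return task_hash[:2] + "/" + basename


# Example values only: a 64-character stand-in for a sha256 hex digest.
example = minimal_sstate_name("3", "ab" * 32, "populate_sysroot")
print(example, len(example.split("/")[1]))
```

With a 64-character hash, the basename length depends only on the sstate
version string and the task name, so it stays well under 255 characters for
any realistic task name, regardless of how long PV is.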

Richard mentioned in the bitbake-devel thread that there may be external tools
using some of the components.  I'm not sure how to even identify what those
tools are at this time, but a new sstate version entry may be enough to start to
deal with them.

--Mark

> Thanks for your comments. They've been very helpful.
> 
> Mike.
> 


