[bitbake-devel] [RFC][WIP][PATCHv1] lib/bb/checksum.py: Speed-up checksum gen when directory is git

richard.purdie at linuxfoundation.org richard.purdie at linuxfoundation.org
Tue Oct 8 23:15:50 UTC 2019


On Tue, 2019-10-08 at 13:45 -0500, Aníbal Limón wrote:
> In some cases people/organizations use a SRC_URI with
> file:///PATH_TO_DIR where the directory contains a git repository,
> for different reasons. This is useful when you want to do an internal
> build without cloning sources from outside.
> 
> This can consume a lot of CPU time because the current taskhash
> generation mechanism does not identify that the folder is a VCS
> checkout (git, svn, cvs) and computes the cksum for every file,
> including those in the .git directory in this case.
> 
> There are different ways to improve the situation:
> 
> * Add protocol=gitscm to the file:// SRC_URI. However, the taskhash
>   is calculated before the fetcher identifies the protocol, so this
>   would require some changes in the bitbake codebase.
> * This patch: when the directory is a git repository (contains .git),
>   use the HEAD rev + git diff to calculate the checksum instead of
>   checksumming every file. This is hackish because it makes some
>   assumptions about the .git directory contents.
> * Variant of this patch: make a list of VCS directories (.git, .svn,
>   .cvs) and exclude them from the cksum calculations. As before, this
>   makes assumptions about the contents of those dot directories.

This is an interesting one.

File checksums are added to the hashes "late" so that we don't have to
reparse entire recipes when files change. We do need a mechanism to
know when we need to reparse the checksum. I think this means you can
skip the checksum calculation for each file but you do still end up
having to stat all files in the tree separately for bitbake's tracking
and for git. We also have to notice when new files are added.
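
As a rough illustration of the stat-based tracking described above, here is
a minimal sketch. It is not bitbake's actual code; the function names
stat_snapshot and changed_paths are hypothetical. It records the mtime and
size of every file without reading contents, and compares two snapshots to
find added, removed or modified files.

```python
import os


def stat_snapshot(top):
    """Map each file under top to (mtime_ns, size) without reading contents."""
    snapshot = {}
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)  # one stat call per file; symlinks not followed
            snapshot[os.path.relpath(path, top)] = (st.st_mtime_ns, st.st_size)
    return snapshot


def changed_paths(old, new):
    """Return paths that were added, removed, or whose stat data differs."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Even with content hashing skipped, the walk still touches every file in the
tree, which is the cost referred to above.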

As such, I'm not convinced this patch will work correctly (e.g. would it
notice if I copy a new file, untracked by git, into the directory?).
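
To make the concern concrete: "git diff" only reports tracked files, so a
checksum built from HEAD plus the diff alone would not change when an
untracked file appears. One possible way to cover that case, sketched below
under assumptions, is to also fold in the files listed by
"git ls-files --others". This is not the patch under discussion; the
function names combine_checksum and git_tree_checksum are hypothetical, and
only standard git subcommands are used.

```python
import hashlib
import os
import subprocess


def _git(repo_dir, *args):
    """Run a git command in repo_dir and return its raw stdout as bytes."""
    return subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, stdout=subprocess.PIPE,
    ).stdout


def combine_checksum(head_rev, diff, untracked):
    """Fold HEAD, the tracked-file diff and untracked files into one digest.

    untracked is an iterable of (relative_path, content) pairs, all bytes.
    """
    h = hashlib.sha256()
    h.update(head_rev.strip())
    h.update(b"\0")
    h.update(diff)
    for path, content in sorted(untracked):
        h.update(b"\0" + path + b"\0")
        h.update(hashlib.sha256(content).digest())
    return h.hexdigest()


def git_tree_checksum(repo_dir):
    """Hypothetical helper: checksum a git working tree, untracked files included."""
    head = _git(repo_dir, "rev-parse", "HEAD")
    # Staged and unstaged changes to tracked files, relative to HEAD.
    diff = _git(repo_dir, "diff", "--binary", "HEAD")
    # Untracked files; without an --exclude option, ignored files are listed too.
    names = _git(repo_dir, "ls-files", "--others", "-z")
    untracked = []
    for name in filter(None, names.split(b"\0")):
        full = os.path.join(os.fsencode(repo_dir), name)
        if os.path.isfile(full):  # skip entries that are not regular files
            with open(full, "rb") as f:
                untracked.append((name, f.read()))
    return combine_checksum(head, diff, untracked)
```

Untracked files still have to be read and hashed here, so the saving is
limited to tracked content and the .git directory itself.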

A first step may be to add some further tests to bitbake-selftest to
better cover this area...

Cheers,

Richard






