[bitbake-devel] [RFC][WIP][PATCHv1] lib/bb/checksum.py: Speed-up checksum gen when directory is git

Aníbal Limón anibal.limon at linaro.org
Tue Oct 8 18:45:12 UTC 2019


Some people and organizations use a SRC_URI of the form
file:///PATH_TO_DIR where the directory contains a git repository.
They do this for various reasons; for example, it is useful for an
internal build that must not clone sources from outside.

This can consume a lot of CPU time because the current taskhash
generation mechanism does not detect that the directory is under
version control (git, svn, cvs) and checksums every file, including
the contents of the .git directory in this case.

There are different ways to improve the situation:

* Add protocol=gitscm to the file:// SRC_URI. However, the taskhash is
  calculated before the fetcher identifies the protocol, so this would
  require some changes in the bitbake codebase.
* This patch: when the directory is a git repository (contains .git),
  use the HEAD revision plus the git diff output to calculate the
  checksum instead of checksumming every file. This is hackish because
  it makes some assumptions about the .git directory contents.
* Variant of this patch: make a list of VCS directories (.git, .svn,
  .cvs) and exclude them from the checksum calculation. As before,
  this makes assumptions about the contents of the dot directories.
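
For discussion, the variant in the last bullet could look roughly like
the standalone sketch below. It is not part of this patch and is not
bitbake code: VCS_DIRS, checksum_file and checksum_dir_skip_vcs are
hypothetical names, and the real checksum_dir in lib/bb/checksum.py
goes through the checksum cache rather than hashing files directly.

```python
import hashlib
import os

# Hypothetical list taken from the description above. CVS itself keeps
# its metadata in directories named "CVS", so that name may be needed too.
VCS_DIRS = {".git", ".svn", ".cvs"}


def checksum_file(path):
    """Return the MD5 hex digest of one file's contents."""
    m = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            m.update(chunk)
    return m.hexdigest()


def checksum_dir_skip_vcs(pth):
    """Checksum every regular file under pth, skipping VCS directories."""
    results = []
    for root, dirs, files in os.walk(pth):
        # Pruning dirs in place stops os.walk from descending into them.
        dirs[:] = sorted(d for d in dirs if d not in VCS_DIRS)
        for name in sorted(files):
            fullpath = os.path.join(root, name)
            if os.path.isfile(fullpath):
                results.append((fullpath, checksum_file(fullpath)))
    return results
```

Unlike the HEAD rev + git diff approach, this still reads every
tracked and untracked file in the working tree, so it only saves the
cost of hashing the VCS metadata itself.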

Signed-off-by: Aníbal Limón <anibal.limon at linaro.org>
---
 lib/bb/checksum.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/bb/checksum.py b/lib/bb/checksum.py
index 5bc8a8fc..ee125cb5 100644
--- a/lib/bb/checksum.py
+++ b/lib/bb/checksum.py
@@ -86,6 +86,19 @@ class FileChecksumCache(MultiProcessCache):
             return checksum
 
         def checksum_dir(pth):
+            git_dir = os.path.join(pth, '.git')
+            if os.path.exists(git_dir):
+                import subprocess, hashlib
+                m = hashlib.md5()
+                head = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=pth)
+                diff = subprocess.check_output(["git", "diff"], cwd=pth)
+                m.update(head)
+                if diff:
+                    m.update(diff)
+
+                return [(pth, m.hexdigest())]
+
+
             # Handle directories recursively
             if pth == "/":
                 bb.fatal("Refusing to checksum /")
-- 
2.23.0


