[OE-core] [RFC 0/9] Hash Equivalency Server

Joshua Watt jpewhacker at gmail.com
Mon Jul 16 20:37:19 UTC 2018


These patches are a first pass at implementing a hash equivalence server
in bitbake & OE.

Apologies for cross-posting this to both bitbake-devel and
openembedded-devel; this work necessarily intertwines both places, and
you need to look at both parts to get an idea of what is going on. For
convenience, the bitbake patches are listed first, followed by the
oe-core patches.

The basic premise is that any given task no longer hashes a dependent
task's taskhash to determine its own taskhash, but instead hashes the
dependent task's "dependency ID". (The dependency ID doesn't strictly
need to be a hash, but it is one for consistency; we can discuss
whether it should be called a "dependency hash" if anyone wants.) This
allows multiple taskhashes to map to the same dependency ID, meaning
that trivial changes to a recipe that would change the taskhash don't
necessarily need to change the dependency ID, and thus don't need to
cause downstream tasks to be rebuilt (with caveats, see below).
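
As a rough illustration of the premise above (a sketch only, not code
from the patches): the helper compute_taskhash() and the strings used
here are hypothetical, and the real siggen code hashes considerably more
than this. The point is only that a downstream taskhash stays stable
when two upstream taskhashes share one dependency ID.

```python
import hashlib


def compute_taskhash(base_hash, dep_ids):
    """Hypothetical helper: mix a task's own base hash with the
    dependency IDs of the tasks it depends on."""
    h = hashlib.sha256(base_hash.encode("utf-8"))
    for dep_id in sorted(dep_ids):
        h.update(dep_id.encode("utf-8"))
    return h.hexdigest()


# Assumption for illustration: a trivial recipe change moved the
# upstream taskhash from "taskhash-A" to "taskhash-B", but both are
# known to map to the same dependency ID.
dep_id_for = {"taskhash-A": "depid-1", "taskhash-B": "depid-1"}

# New scheme: the downstream task hashes the upstream dependency ID.
new_before = compute_taskhash("downstream-base", [dep_id_for["taskhash-A"]])
new_after = compute_taskhash("downstream-base", [dep_id_for["taskhash-B"]])

# Old scheme: the downstream task hashes the upstream taskhash directly.
old_before = compute_taskhash("downstream-base", ["taskhash-A"])
old_after = compute_taskhash("downstream-base", ["taskhash-B"])
```

Under the new scheme new_before and new_after are identical, so the
downstream task need not be rebuilt; under the old scheme the two
hashes differ.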

In the absence of any interaction by the user, the dependency ID for a
task is just that task's taskhash, which effectively maintains the
current behavior. However, if the user enables the "OEEquivHash"
signature generator, they can direct it to look at a hash equivalency
server (of which a reference implementation is provided). The sstate
code will provide the server with an output hash that it calculates, and
the server will record all tasks with the same output hash as
"equivalent" and report the same dependency ID for them when requested.
When initializing tasks, bitbake can ask the server about the dependency
ID for new tasks it has never seen before and potentially skip
rebuilding, or restore the task from an equivalent sstate file. To
facilitate restoring tasks from sstate, sstate objects are now named
based on the task's dependency ID instead of the taskhash (which, again,
has no effect if the server is not in use).
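
The bookkeeping described above can be sketched in memory as follows.
This is not the reference server from the patches: the class
EquivalenceTable and its methods are hypothetical names, and the choice
of the first reported taskhash as the dependency ID for an output hash
is an assumption made for the sake of the example. A real server would
presumably also key entries by task or method, which is omitted here.

```python
class EquivalenceTable:
    """Hypothetical in-memory stand-in for a hash equivalence server."""

    def __init__(self):
        self._by_outhash = {}   # output hash -> dependency ID
        self._by_taskhash = {}  # taskhash -> dependency ID

    def report(self, taskhash, outhash):
        """Record a finished task; return the dependency ID to use.

        Assumption: the first taskhash seen for an output hash becomes
        the dependency ID for every later task with the same output.
        """
        dep_id = self._by_outhash.setdefault(outhash, taskhash)
        self._by_taskhash[taskhash] = dep_id
        return dep_id

    def lookup(self, taskhash):
        """Return the known dependency ID, or the taskhash itself when
        nothing is known (which matches the no-server behavior)."""
        return self._by_taskhash.get(taskhash, taskhash)


table = EquivalenceTable()
first = table.report("taskhash-A", "outhash-1")   # first of its kind
second = table.report("taskhash-B", "outhash-1")  # same output as A
unknown = table.lookup("taskhash-C")              # never reported
```

Here taskhash-A and taskhash-B end up with the same dependency ID, so a
task with taskhash-B could be restored from the sstate object written
for taskhash-A, while the unknown taskhash-C falls back to itself.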

This patchset doesn't make any attempt to dynamically update task
dependency IDs after bitbake initializes the tasks, and as such there
are some cases where this isn't accelerating the build as much as it
possibly could. I think it will be possible to add support for this, but
this preliminary support needs to come first.

Some patches have additional NOTEs that indicate places where I wasn't
sure what to do.

You can also see these patches (and my first attempts at dynamic task
re-hashing) on the "jpew/hash-equivalence" branch in poky-contrib.

As always, thanks for your feedback and time.

Joshua Watt (9):
  bitbake-worker: Pass taskhash as runtask parameter
  siggen: Split out stampfile hash fetch
  siggen: Split out task depend ID
  runqueue: Track task dependency ID
  runqueue: Pass dependency ID to task
  runqueue: Pass dependency ID to hash validate
  classes/sstate: Handle depid in hash check
  hashserver: Add initial reference server
  sstate: Implement hash equivalence sstate

 bitbake/bin/bitbake-worker            |   9 +-
 bitbake/contrib/hashserver/.gitignore |   2 +
 bitbake/contrib/hashserver/Pipfile    |  15 ++
 bitbake/contrib/hashserver/app.py     | 212 ++++++++++++++++++++++++++
 bitbake/lib/bb/runqueue.py            |  56 ++++---
 bitbake/lib/bb/siggen.py              |  20 ++-
 meta/classes/sstate.bbclass           | 102 +++++++++++--
 meta/conf/bitbake.conf                |   4 +-
 meta/lib/oe/sstatesig.py              | 166 ++++++++++++++++++++
 9 files changed, 544 insertions(+), 42 deletions(-)
 create mode 100644 bitbake/contrib/hashserver/.gitignore
 create mode 100644 bitbake/contrib/hashserver/Pipfile
 create mode 100755 bitbake/contrib/hashserver/app.py

--
2.17.1



