[OE-core] [PATCH 1/3] utils/md5_file: don't iterate line-by-line

Burton, Ross ross.burton at intel.com
Mon Aug 13 18:04:43 UTC 2018


Yeah, I just sent it there, sorry.

On 13 August 2018 at 19:03, akuster808 <akuster808 at gmail.com> wrote:
>
>
> On 08/13/2018 10:20 AM, Ross Burton wrote:
>> Opening a file in binary mode and iterating it seems like the simple solution
>> but will still break on newlines, which for binary files isn't really useful as
>> the size of the chunks could be huge or tiny.
>>
>> Instead, let's be a bit more clever: we'll be MD5ing lots of files, but we don't
>> want to fill up memory: use mmap() to open the file and read the file in 8k
>> blocks.
>>
>> Signed-off-by: Ross Burton <ross.burton at intel.com>
>
> Shouldn't this go to the bitbake mailing list?
>> ---
>>  bitbake/lib/bb/utils.py | 13 +++++++++----
>>  1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/bitbake/lib/bb/utils.py b/bitbake/lib/bb/utils.py
>> index 9903183213b..b20cdabcf01 100644
>> --- a/bitbake/lib/bb/utils.py
>> +++ b/bitbake/lib/bb/utils.py
>> @@ -524,12 +524,17 @@ def md5_file(filename):
>>      """
>>      Return the hex string representation of the MD5 checksum of filename.
>>      """
>> -    import hashlib
>> -    m = hashlib.md5()
>> +    import hashlib, mmap
>>
>>      with open(filename, "rb") as f:
>> -        for line in f:
>> -            m.update(line)
>> +        m = hashlib.md5()
>> +        try:
>> +            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
>> +                for chunk in iter(lambda: mm.read(8192), b''):
>> +                    m.update(chunk)
>> +        except ValueError:
>> +            # You can't mmap() an empty file so silence this exception
>> +            pass
>>      return m.hexdigest()
>>
>>  def sha256_file(filename):
>
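
For anyone who wants to try the idea outside of bitbake, here is a minimal standalone sketch of the approach in the patch. The names md5_file_chunked and line_chunk_sizes, the blocksize parameter, and the demo data are my own for illustration and are not part of bb.utils. The second helper only shows why iterating a binary file by "line" gives unpredictable chunk sizes.

```python
import hashlib
import mmap
import os
import tempfile


def md5_file_chunked(filename, blocksize=8192):
    """Return the hex MD5 of filename, reading fixed-size blocks via mmap.

    Hypothetical standalone variant of the patched md5_file(); the
    blocksize parameter is an assumption added for illustration.
    """
    m = hashlib.md5()
    with open(filename, "rb") as f:
        try:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # mm.read() returns b"" once the end of the map is reached
                for chunk in iter(lambda: mm.read(blocksize), b""):
                    m.update(chunk)
        except ValueError:
            # An empty file cannot be mapped; the digest of no data is
            # still the correct answer, so fall through.
            pass
    return m.hexdigest()


def line_chunk_sizes(filename):
    """Return the sizes of the pieces produced by iterating a binary file."""
    with open(filename, "rb") as f:
        return [len(line) for line in f]


if __name__ == "__main__":
    # Binary-looking data: one long run without a newline, then several
    # bare newlines, to show how uneven line-based chunks can be.
    data = b"\x00" * 20000 + b"\n" * 4
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        print("line-based chunk sizes:", line_chunk_sizes(path))
        print("md5 matches hashlib:",
              md5_file_chunked(path) == hashlib.md5(data).hexdigest())
    finally:
        os.remove(path)
```

The result should be identical to hashing the whole file in one go; only the size of the pieces passed to update() changes.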


