[OE-core] [PATCH 1/9] lib/oe/patch: handle non-UTF8 encoding when reading patches

Tue Sep 6 20:16:12 UTC 2016

Hi Enrico,

On Tue, 06 Sep 2016 17:50:02 Enrico Scholz wrote:
> Paul Eggleton writes:
> > When extracting patches from a git repository with PATCHTOOL = "git" we
> > cannot assume that all patches will be UTF-8 formatted, so as with other
> > places in this module, try latin-1 if utf-8 fails.
> 
> This will probably not work when patch contains a character between 128
> and 159 (which is a blackhole in all locales afaik).

I realise it's by no means perfect - you may even fairly label it a hack,
since it only handles two encodings out of many. However, I was keen to at
least restore the ability to handle the majority of patches we have in the
core; we can always improve it subsequently (even before the release).
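
For illustration, the fallback described above can be sketched roughly as
follows. This is a minimal sketch and not the actual code in lib/oe/patch;
the function name decode_patch_bytes is hypothetical.

```python
def decode_patch_bytes(data):
    """Decode raw patch bytes, trying UTF-8 first and then Latin-1.

    Hypothetical helper for illustration only; returns the decoded text
    together with the name of the encoding that succeeded.
    """
    for encoding in ("utf-8", "latin-1"):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            # Not valid in this encoding, try the next candidate.
            continue
    # Not expected to be reached: Python's latin-1 codec accepts any byte.
    raise ValueError("could not decode patch data")


# Usage: a byte that is invalid as UTF-8 falls through to latin-1.
text, used = decode_patch_bytes(b"Author: Caf\xe9\n")
```

Note that Python's latin-1 codec maps every byte value to a code point, so
the second attempt does not raise. Bytes in the 128-159 range come out as C1
control characters, which may not be what the patch author intended.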

> I would read the file as a binary ('rb' instead of 'r') and make the
> GitApplyTree.* strings a 'bytes' type.

The code is not just passing the data through; it is actually processing it.
If we did what you propose, wouldn't that make the processing more difficult?
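
To make the trade-off concrete, here is a rough sketch of what line
processing on bytes could look like. It is an assumption-laden illustration,
not the GitApplyTree code: the MARKER value and the split_marked_lines name
are hypothetical, and the stream stands in for a file opened with 'rb'.

```python
import io

# Hypothetical marker, standing in for a prefix string kept as bytes.
MARKER = b"%% example marker:"


def split_marked_lines(stream, marker=MARKER):
    """Split binary lines into marker payloads and everything else.

    `stream` is any iterable of bytes lines, e.g. a file opened with 'rb'.
    """
    marked, other = [], []
    for line in stream:
        if line.startswith(marker):
            # Keep the payload after the marker, without whitespace.
            marked.append(line[len(marker):].strip())
        else:
            # Pass non-marker lines through untouched, whatever the encoding.
            other.append(line)
    return marked, other


# Usage: the second line is not valid UTF-8 but needs no decoding here.
data = io.BytesIO(b"%% example marker: foo.patch\nSubject: caf\xe9\n")
marked, other = split_marked_lines(data)
```

Prefix matching and slicing work the same way on bytes, but every literal
involved has to be bytes too: calling startswith on a bytes line with a str
argument raises TypeError, so any text-level handling still needs an explicit
decode at some point.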

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre