[OE-core] [PATCH] buildhistory.bbclass: Specify lang in decoding strings

Thu Jan 23 18:59:09 UTC 2014

On Thu, Jan 23, 2014 at 10:26 AM, Paul Eggleton
<paul.eggleton at linux.intel.com> wrote:
> On Saturday 28 December 2013 22:52:18 Paul Eggleton wrote:
>> On Friday 06 December 2013 16:11:54 Khem Raj wrote:
>> > On systems where default locale is utf-8 we get errors like
>> >
>> > File: 'buildhistory.bbclass', lineno: 38, function: write_pkghistory
>> > 0034: if pkginfo.rconflicts:
>> > 0035: f.write("RCONFLICTS = %s\n" % pkginfo.rconflicts)
>> > 0036: f.write("PKGSIZE = %d\n" % pkginfo.size)
>> > 0037: f.write("FILES = %s\n" % pkginfo.files)
>> > *** 0038: f.write("FILELIST = %s\n" % pkginfo.filelist)
>> > 0039:
>> > 0040: for filevar in pkginfo.filevars:
>> > 0041: filevarpath = os.path.join(pkgpath, "latest.%s" % filevar)
>> > 0042: val = pkginfo.filevars[filevar]
>> > Exception: UnicodeEncodeError: 'ascii' codec can't encode character
>> > u'\xed' in position 337: ordinal not in range(128)
>> >
>> > This patch specifies decode to use utf-8 so ascii and utf-8 based
>> > locales both work
>> >
>> > Signed-off-by: Khem Raj <raj.khem at gmail.com>
>> > ---
>> >
>> >  meta/classes/buildhistory.bbclass |    2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/meta/classes/buildhistory.bbclass b/meta/classes/buildhistory.bbclass
>> > index 1e6d968..4ff39a0 100644
>> > --- a/meta/classes/buildhistory.bbclass
>> > +++ b/meta/classes/buildhistory.bbclass
>> > @@ -190,7 +190,7 @@ python buildhistory_emit_pkghistory() {
>> >                  key = item[0]
>> >                  if key.endswith('_' + pkg):
>> >                      key = key[:-len(pkg)-1]
>> > -                pkgdata[key] = item[1].decode('string_escape')
>> > +                pkgdata[key] = item[1].decode('utf-8', 'string_escape')
>> >
>> >          pkge = pkgdata.get('PKGE', '0')
>> >          pkgv = pkgdata['PKGV']
>>
>> Khem, did you test that this actually works? Here it does not - I get
>> strings with \n \t in them; reverting this change makes it interpret these
>> as it should.
>
> Unless I'm misunderstanding, I think the second parameter here is wrong, since
> according to the Python docs it's supposed to specify what to do when an error
> occurs, not specify another encoding to decode. Should it be
> .decode('utf-8').decode('string_escape') instead?
>

Yes, that seems more likely to be the right thing to do.
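
For illustration, the difference between the two calls can be sketched
as below. This is a Python 3 sketch, not the Python 2 code in
buildhistory.bbclass: the 'string_escape' codec does not exist in
Python 3, so 'unicode_escape' plus a latin-1 round trip stands in for it
here. The helper names and the sample value are made up for the example.

```python
def decode_with_errors_arg(raw):
    # The second positional argument of decode() is the error handler
    # ('strict', 'ignore', 'replace', ...), not a second codec, so the
    # backslash escapes in the data are left untouched.
    return raw.decode("utf-8", "strict")


def decode_two_steps(raw):
    # Step 1: interpret backslash escapes. 'unicode_escape' maps every
    # other byte to the code point with the same value (latin-1 semantics).
    unescaped = raw.decode("unicode_escape")
    # Step 2: recover the underlying bytes and decode them as UTF-8.
    # (Assumes the input has no \u escapes above U+00FF.)
    return unescaped.encode("latin-1").decode("utf-8")


# Hypothetical pkgdata value: UTF-8 bytes for u'\xed' plus literal
# backslash-n and backslash-t sequences.
sample = b"/usr/share/d\xc3\xada\\n\\tnext"

print(repr(decode_with_errors_arg(sample)))  # escapes still literal
print(repr(decode_two_steps(sample)))        # escapes interpreted
```

In this sketch the escapes are processed before the UTF-8 decode; whether
that order is also what the Python 2 code needs would have to be checked
against the real pkgdata values.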

> The problem that I have is I can't reproduce the failure that you observed, so
> I need you to help me to fix this properly.

I see the failure on a newly installed Debian wheezy with no locales installed.
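
The same class of error can be reproduced in isolation along these lines.
This is only a Python 3 sketch under assumptions: an in-memory stream
with an explicit encoding stands in for a file written under an ASCII
environment, and the exact mechanism in the Python 2 build may differ.
The helper names and the path are hypothetical.

```python
import io


def write_filelist(stream, filelist):
    # Mirrors the shape of the f.write() call flagged in the traceback.
    stream.write("FILELIST = %s\n" % filelist)


def make_stream(encoding):
    # Text stream over an in-memory buffer with an explicit encoding.
    # newline="\n" keeps the output identical across platforms.
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding, newline="\n")


filelist = "/usr/share/d\u00eda"  # contains u'\xed'

try:
    write_filelist(make_stream("ascii"), filelist)
except UnicodeEncodeError as exc:
    print("ascii stream failed:", exc.reason)

utf8_stream = make_stream("utf-8")
write_filelist(utf8_stream, filelist)
utf8_stream.flush()
print(utf8_stream.buffer.getvalue())
```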

>
> Thanks,
> Paul
>
> --
>
> Paul Eggleton
> Intel Open Source Technology Centre